1 Introduction

We study in this paper certain properties of the responses of dynamical systems to external in- 
puts. Our results are purely mathematical and thus of wide applicability, but our motivation arises 
from molecular systems biology. Indeed, the behavior, adaptability, and survival of organisms
depend critically upon their capability to formulate appropriate responses to chemical and physical
environmental cues. In particular, signal transduction and gene regulatory networks in individual 
cells mediate the processing of measured chemical concentrations and physical conditions, such 
as ligand concentrations or stresses, eventually leading to regulatory changes in metabolism and 
gene expression. Often, the ultimate goal of these changes is to maintain a narrow range of con- 
centration levels of vital quantities (homeostasis, adaptation) while at the same time appropriately 
reacting to changes in the environment (signal detection). Much theoretical, modeling, and analy- 
sis effort has been devoted to the understanding of these questions, traditionally in the context of 
steady-state responses to constant or step-changing stimuli. 

In this work, we are concerned with questions that complement the analysis of simple temporal 
inputs and steady-state responses, focusing on certain properties of transient behaviors, both for 
simple stimuli like step changes and for more complex time-varying input profiles. The study of 
transient responses is of central concern in cell biology, since behavior at the time-scale of
signaling may have important consequences for cell survival. Moreover, typical signals encountered
by cells in their natural environments may well exhibit interesting temporal information, and thus 
characterizing responses to fluctuating temporal patterns may provide new insights regarding cell 
behavior. Such responses, and the analysis of transient behavior, can also help rule out postulated
mechanisms when tested through modern experimental tools which allow for fine spatiotemporal
resolution in measurements. This type of model falsification is of course routine, but the broader 
framework allows for testing of a richer class of "phenotypes" and may thus help guide searches 
for yet-unknown molecular mechanisms. 

The immediate motivation for this work is the recent discovery of an important transient prop- 
erty, related to Weber's law in psychophysics: fold-change detection (FCD) in adapting systems, 
the property that scale uncertainty does not affect responses. FCD appears to play an important role 
in key signal transduction mechanisms in eukaryotes, including the ERK and Wnt pathways,
as well as in Escherichia coli and possibly other prokaryotic chemotaxis pathways [1]-[3]. The
mathematical analysis of FCD was started in [3, 4]. In this paper, we provide further theoretical
results regarding this property. Far more generally, we develop a necessary and sufficient char- 
acterization of adapting systems whose transient behaviors are invariant under the action of a set 
(often, a group) of symmetries in their sensory field. A particular instance is FCD, which amounts 
to invariance under the action of the multiplicative group of positive real numbers. Our main re- 
sult is framed in terms of a notion which considerably extends equivariant actions of compact Lie 
groups. 

This paper is organized as follows. Section 2 introduces the main definitions and the statement of
the main result of this paper. Section 3 illustrates how the main result can be used to check
invariance in a number of simple examples. Section 4 contains the proof of the main theorem, as well as
a self-contained review of some key concepts needed from nonlinear control theory. Section 5 fills
in a stability proof that is needed in order to justify that several of our simple examples are indeed
adapting systems. Section 6 compares feedforward and feedback architectures, in the context of
the "internal model principle" of control theory. Section 7 provides a simple result showing that
search strategies in sensory fields subject to symmetries are invariant, provided that the underlying
system itself is invariant.



2 Notations, definitions, and statement of main theorem 

We study dynamical systems with inputs and outputs: 

\[ \dot z = F(z, u), \qquad y = h(z), \tag{1} \]

where $F$, $h$ are functions which describe respectively the dynamics and the read-out map.
Equation (1) is meant as shorthand for

\[ \frac{dz}{dt}(t) = F(z(t), u(t)), \qquad y(t) = h(z(t)). \]

Here, $u = u(t)$ is a generally time-dependent input (also called, depending on the context, a
"stimulus" or "excitation") function, $z(t)$ is an $n$-dimensional vector of state variables, and
$y(t)$ is the output ("response" or "reporter" variable).

The paradigm of studying systems in the form (1) is standard in control systems theory [5].

Typically, $y(t)$ is just a read-out of one of the components, $z_i(t)$, of $z(t)$. However, it is also
sometimes natural to take a more complicated function $y(t) = h(z(t))$ of the coordinates of $z$ than
just picking an individual $z_i$. For instance, suppose that $z_1(t)$ represents the concentration of the
free form of an enzyme E, that $z_2(t)$ is the concentration of E complexed with some substrate, and
that these two species are indistinguishable by a Western blot assay measurement. Then the sum
$y = h(z) = z_1 + z_2$ might be the reporter variable of interest. More generally, the theory does not
change substantially if we allow the output variable $y(t)$ to be a function of the current input as
well as of the current state, $y = h(z, u)$. It is notationally and technically more convenient to take
$y = h(z)$, so we do that here. In any event, one could add a new variable $z_{n+1}$ to $z = (z_1, \ldots, z_n)$
and a differential equation $\varepsilon \dot z_{n+1} = h(z, u) - z_{n+1}$ which (provided $0 < \varepsilon \ll 1$) quickly equilibrates
to $y = z_{n+1} = h(z, u)$; with this small modification, $y$ depends only on a single coordinate
of an extended state vector $(z_1, \ldots, z_{n+1})$ and does not directly depend on the current input.

In order to describe positivity of variables as well as other constraints, we introduce the
following additional notations. States, inputs, and outputs are constrained to lie in particular subsets,
which we call $Z$, $U$, and $Y$ respectively, of Euclidean spaces $\mathbb{R}^n$, $\mathbb{R}^m$, $\mathbb{R}^q$. For example, $U = \mathbb{R}_{>0}$
means that the input values must be scalar ($m = 1$, $U \subset \mathbb{R}$) and positive. The functions $F$, $h$ are
differentiable. We will assume that for each piecewise-continuous input $u : [0, \infty) \to U$, and each
initial state $\xi \in Z$, there is a (unique) solution $z : [0, \infty) \to Z$ of (1) with initial condition $z(0) = \xi$,
which we write as

\[ \varphi(t, \xi, u), \]

and we denote the corresponding output $y : [0, \infty) \to Y$, given by $h(\varphi(t, \xi, u))$, as

\[ \psi(t, \xi, u) \]

(see [5] for more discussion, properties of ODEs, global existence of solutions, etc.).

Since we are interested in adapting systems, we will assume that for each constant input $u(t) \equiv \bar u$,
there is a unique steady state of the system (which depends on the particular input). We denote
this steady state by $\sigma(\bar u)$. In other words, $z = \sigma(\bar u)$ is the unique solution of the algebraic equation
$F(z, \bar u) = 0$. Finally, we will assume that this steady state is globally asymptotically stable (GAS).
This means that $\sigma(\bar u)$ is Lyapunov stable and globally attracting for the system when the input is
$u(t) \equiv \bar u$:

\[ \lim_{t \to \infty} \varphi(t, \xi, \bar u) = \sigma(\bar u) \]

for every initial condition $\xi \in Z$. Multi-stable systems may be considered as well, of course, but
the definitions to follow become very cumbersome.

We will illustrate our results using the two sets of examples that are shown in Figs. 1 and 2.

(a) linear: $\dot x = \alpha(y - y_0)$, $\dot y = \beta u - \mu x - \gamma y$
(b) log-linear: $\dot x = \alpha(y - y_0)$, $\dot y = \beta \ln u - \mu x - \gamma y$
(c) nonlinear I: $\dot x = \alpha x(y - y_0)$, $\dot y = \beta u/x - \gamma y$
(d) nonlinear II: $\dot x = \alpha x(y_0 - y)$, $\dot y = \beta u x - \gamma y$

Figure 1: Integral feedback systems (assuming $u > 0$ in (b), and $u, x > 0$ in (c,d))

(a) intermediate inhibits formation of $y$: $\dot x = \alpha u - \delta x$, $\dot y = \beta u/x - \gamma y$
(b) intermediate promotes degradation: $\dot x = \alpha u - \delta x$, $\dot y = \beta u - \gamma x y$

Figure 2: Incoherent feedforward loops (IFFL) (assuming $u, x > 0$ in (a))

In the equations shown in these figures, the vector $z = (z_1, z_2) = (x, y)$ has dimension $n = 2$ and the
output is the second component, $h(z) = y$. In other words, in these examples, $F(z, u) = F(x, y, u)$ is
a vector function with two components, which for $n = 2$ we write as $(f(x, y, u), g(x, y, u))$. We have
that $f(x, y, u)$ is $\alpha(y - y_0)$ in Fig. 1(a,b), $\alpha x(y - y_0)$ in Fig. 1(c), $\alpha x(y_0 - y)$ in Fig. 1(d), and
$\alpha u - \delta x$ in Fig. 2, and that $g(x, y, u)$ is $\beta u - \mu x - \gamma y$ in Fig. 1(a), $\beta \ln u - \mu x - \gamma y$ in Fig. 1(b),
$\beta u/x - \gamma y$ in Fig. 1(c) and Fig. 2(a), $\beta u x - \gamma y$ in Fig. 1(d), and $\beta u - \gamma x y$ in Fig. 2(b).
The constants $\alpha, \beta, \ldots$ are positive numbers. We emphasize
that our theory is valid for any dimension $n$; these examples are only picked for illustration.

2.1 Perfect adaptation 

Definition 2.1 The system (1) perfectly adapts to constant inputs provided that the steady-state
output $h(\sigma(\bar u))$ equals some fixed $y_0 \in Y$, independently of the particular input value $\bar u \in U$. □

Adaptation (we will just say "adapt" instead of "perfectly adapt to constant inputs" in what fol- 
lows) means that the steady-state output value is independent of the actual value of the input, if 
the input is constant. This property may be achieved by differentiating the input signal before it is 
further processed by the system. However, differentiation is very sensitive to high-frequency noise, 
and in fact there is no need for differentiation to be explicitly performed: there are several
alternative mechanisms, such as those represented in Figs. 1 and 2, integral feedback and incoherent
feedforward loops respectively, that are also capable of achieving adaptation.

In integral feedback systems, an internal "memory" variable keeps track of the accumulated 
(i.e., integrated) difference between the current value y(t) of the response variable and its desired 
steady-state value yo. A difference, or a nonlinear comparison such as a ratio, is performed between 
this memory variable and the current input, thus providing an "error" signal that is used to drive 
the feedback mechanism that brings the system back to its default value. Integral feedback is 
recognized as a key feature of perfectly adapting biological systems, both at the physiological and 
cellular level, such as, for example, in blood calcium homeostasis [6], in neuronal control of the
prefrontal cortex [7], in the regulation of tryptophan in E. coli [8], and in E. coli chemotaxis.

Fig. 1(a) shows the linear integral feedback configuration (PI, or proportional-integral, control)
that is classically treated in control theory. When $u \equiv \bar u$ is constant, the unique steady state of this
system is given by $\bar x = (\beta \bar u - \gamma y_0)/\mu$ and $\bar y = y_0$. Thus, this system adapts: the steady state value of
$y$ is independent of the input $\bar u$. Moreover, since the eigenvalues of any matrix of the form

\[ \begin{pmatrix} 0 & \alpha \\ -\mu & -\gamma \end{pmatrix} \]

have negative real parts, it is clear that this system is globally asymptotically stable: $x(t) \to \bar x$ and
$y(t) \to \bar y = y_0$ as $t \to \infty$. (Taking $\alpha$ and $\mu$ both negative instead of positive gives the same conclusions.) Two
other integral feedback configurations, also perfectly adapting, are shown in Fig. 1. In (b), a
"log-linear system," the only difference with (a) is that the input is logarithmically pre-processed;
this does not change the conclusions of adaptation and stability. In system (c), the memory variable
feeds upon itself, and the ratio $u/x$, instead of a difference, is used to compare the current input
and memory values. In system (d), the memory variable also feeds upon itself, and the product
$ux$ is used in the feedback term to $y$. Both (c) and (d) adapt (to $\bar y = y_0$), with $\bar x = \beta \bar u/(\gamma y_0)$ in (c) and
$\bar x = \gamma y_0/(\beta \bar u)$ in (d). Stability is a bit more subtle, and is based on a control-Lyapunov approach [5]
that recasts (c) as a Hamiltonian system with added damping; see Section 5. (The ratio "$u/x$"
in (c) is not a natural choice for biological models; however, one may think of this term as an
approximation of a Michaelis-Menten inhibition term $u/(K_m + x)$, with $K_m \ll 1$.)
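The adaptation claims for the integral feedback systems can be checked numerically. The following is a minimal sketch, not part of the original analysis: it integrates the systems of Fig. 1(a) and Fig. 1(c) with a hand-written Runge-Kutta scheme for a few constant inputs and reports the state reached. The parameter values, initial states, and the helper name `settle` are illustrative assumptions.

```python
# Assumed, illustrative parameter values (not taken from the paper).
ALPHA, BETA, MU, GAMMA, Y0 = 1.0, 1.0, 1.0, 2.0, 1.0


def linear_feedback(z, u):
    """Fig. 1(a): x' = alpha (y - y0), y' = beta u - mu x - gamma y."""
    x, y = z
    return [ALPHA * (y - Y0), BETA * u - MU * x - GAMMA * y]


def ratio_feedback(z, u):
    """Fig. 1(c): x' = alpha x (y - y0), y' = beta u / x - gamma y (x > 0)."""
    x, y = z
    return [ALPHA * x * (y - Y0), BETA * u / x - GAMMA * y]


def settle(rhs, z0, u_bar, t_end=40.0, dt=0.01):
    """Integrate z' = rhs(z, u_bar) with classical RK4 and return z(t_end)."""
    z = list(z0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(z, u_bar)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(z, k1)], u_bar)
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(z, k2)], u_bar)
        k4 = rhs([a + dt * b for a, b in zip(z, k3)], u_bar)
        z = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(z, k1, k2, k3, k4)]
    return z


for u_bar in (0.5, 2.0, 5.0):
    xa, ya = settle(linear_feedback, [0.0, 0.0], u_bar)
    xc, yc = settle(ratio_feedback, [1.0, 0.5], u_bar)
    print(f"u = {u_bar}: (a) x = {xa:.4f}, y = {ya:.4f}   (c) x = {xc:.4f}, y = {yc:.4f}")
```

Under these assumptions the memory variable $x$ settles at an input-dependent value while the output $y$ returns to $y_0$ for every constant input, which is what adaptation requires.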

A different type of architecture is based on feedforward as opposed to feedback interconnections.
Feedforward circuits are ubiquitous in biology, as emphasized in [10], where they were
shown to be over-represented in E. coli gene transcription networks, compared to other "motifs"
involving three nodes. Similar conclusions apply to certain control mechanisms in mammalian
cells [11]. A large number of papers have been devoted to the signal-processing capabilities of the
feedforward motif, notably [12], which looked into its properties as a "change detector" (essentially,
sensitivity to changes in the magnitude of the input signal), and [13], which studied its optimality
with respect to periodic inputs. Comparisons with other "three node" architectures with respect to
the trade-off of sensitivity versus noise filtering are given in [14]. Other references on feedforward
circuits include [15] (showing their over-representation at the interface of genetic and metabolic
networks), [16] (classification of different subtypes of such circuits), and [17] (classification into
"time-dependent" versus "dose-dependent" biphasic responses, which are in a sense the opposite
of adapted responses).

In particular, in incoherent feedforward loops (IFFL), as in Fig. 2, the input $u$ directly helps
promote formation of the reporter $y$ and also acts as a delayed inhibitor, through an intermediate
variable $x$. This "incoherent" counterbalance between a positive and a negative effect gives rise,
under appropriate conditions, to adaptation. The reference [17] provides a large number of
incoherent feedforward input-to-response circuits, which participate in EGF to ERK activation [18, 19],
glucose to insulin release [20, 21], ATP to intracellular calcium release [22, 23], nitric oxide to
NF-κB activation [24], microRNA regulation [25], and many others. Several varieties of IFFL circuits
have often been proposed for perfect adaptation to constant signals in biological systems. Notably,
the IFFL shown in Fig. 2(b), often called the "sniffer" [26, 27], appears in slightly modified forms
in models for Dictyostelium chemotaxis and neutrophils [28, 29], microRNA-mediated loops [30],
and E. coli carbohydrate uptake via the carbohydrate phosphotransferase system [31] and other
metabolic systems [32]. The work [33] shows experimentally and analytically that IFFLs are
especially well-suited to controlling protein expression under DNA copy variability. For both
systems in Fig. 2, the unique steady state, when the input $u \equiv \bar u$ is constant, has coordinates $\bar x = \alpha \bar u/\delta$ and
$\bar y = y_0 = \beta\delta/(\alpha\gamma)$. Since $\bar y$ is independent of $\bar u$, the system adapts. Global asymptotic stability for
(a) follows from the fact that the $x$-subsystem is linear and stable, and the $y$-subsystem is a stable
linear system driven by the converging signal $u/x$. For (b) (and several variations of this system),
the GAS property is studied in [27].
2.2 Invariance



As mentioned in the introduction, we wish to study the invariance of outputs under input trans- 
formations. The original motivation was the study of the particular case of scale invariance, which 
is described through the following thought experiment. 

2.2.1 A special case: scale-invariance (FCD) 

Suppose that a system that adapts has had a chance to "pre-adapt" to a certain constant
("background") level $\bar u$ of the input, for $t < 0$, and that now we present the system with the new input
$u(t)$, $t \ge 0$. Let $y_1(t)$ be the output function that results. Next imagine that we allowed the same
system to pre-adapt, instead, to $p\bar u$ for $t < 0$, and we now present the system with $p u(t)$ for $t \ge 0$,
where $p$ is some positive scalar. Let $y_2(t)$ be the resulting output. Scale invariance means that the
outputs of the two experiments should be the same: $y_1(t) = y_2(t)$ for all $t \ge 0$. In other words,
for any two inputs $u(t)$ and $p u(t)$, as in Fig. 3, and no matter what positive number or "scale" $p$
we picked, the entire shape of the response, including amplitude and duration, is identical. As an
example, a step change in input from, say, a constant level 2 to a constant level 4, should result in
precisely the same output as a step from constant level 5 to constant level 10 (we scaled everything
by $p = 2.5$), Fig. 4(d). On the other hand, a change from, say, level 5 to 25 (which has a fold-change
of 5, instead of 2) will typically lead to a different response. Thus, another way to describe this
invariance property is to say that the only potentially detectable differences in response are due to
fold changes, and this motivates the name fold-change detection (FCD) [3, 4]. FCD represents a
particular type of adaptation, one in which there is robustness to scale uncertainty, and it can
be found at many levels of biological organization. A weak version is present in the Weber-Fechner
law logarithmic sensing feature in psychophysics: many sensory systems (for weight, vision, hearing)
produce responses whose maximal amplitude only depends on the ratio between the stimulus
and a background or starting value, Fig. 4(c). It was also recently discovered that the transient
responses of several biological cellular signaling systems [1, 2] display FCD features.

Figure 3: Two scaled inputs

We can formally define the FCD property as follows. Denoting by "$\pi u$" the input $t \mapsto p u(t)$, and
by "$\pi \bar u$" the constant $p \bar u$, the equality of outputs

\[ \psi(t, \sigma(\bar u), u) = \psi(t, \sigma(\pi \bar u), \pi u) \qquad \forall\, t \ge 0 \tag{2} \]

should hold for all possible $p > 0$, as well as all constant inputs $\bar u \in U$ and inputs $u = u(t)$
with values in $U$. (Observe that the requirement that $\bar u$ and $u$ take values in $U$ serves to impose
constraints such as, for example, positivity.)
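The step example from the thought experiment (2 to 4 versus 5 to 10, and 5 to 25 for comparison) can be reproduced numerically. The sketch below is illustrative only: it uses the incoherent feedforward loop of Fig. 2(a) as the adapting system, with all rate constants set to 1 as an assumption, starts each run from the pre-adapted state $\sigma(\bar u)$, and compares the sampled outputs. The helper names `step_response` and `max_gap` are hypothetical.

```python
# Assumed, illustrative parameter values (not taken from the paper).
ALPHA, DELTA, BETA, GAMMA = 1.0, 1.0, 1.0, 1.0


def iffl(z, u):
    """Fig. 2(a): x' = alpha u - delta x, y' = beta u / x - gamma y."""
    x, y = z
    return [ALPHA * u - DELTA * x, BETA * u / x - GAMMA * y]


def sigma(u_bar):
    """Steady state reached under the constant input u_bar."""
    return [ALPHA * u_bar / DELTA, BETA * DELTA / (ALPHA * GAMMA)]


def step_response(u_before, u_after, t_end=10.0, dt=0.001):
    """Output samples y(t) after pre-adapting to u_before and stepping to u_after at t = 0."""
    z = sigma(u_before)
    ys = [z[1]]
    for _ in range(int(round(t_end / dt))):
        k1 = iffl(z, u_after)
        k2 = iffl([a + 0.5 * dt * b for a, b in zip(z, k1)], u_after)
        k3 = iffl([a + 0.5 * dt * b for a, b in zip(z, k2)], u_after)
        k4 = iffl([a + dt * b for a, b in zip(z, k3)], u_after)
        z = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(z, k1, k2, k3, k4)]
        ys.append(z[1])
    return ys


def max_gap(a, b):
    """Largest pointwise difference between two sampled outputs."""
    return max(abs(p - q) for p, q in zip(a, b))


y_2_to_4 = step_response(2.0, 4.0)
y_5_to_10 = step_response(5.0, 10.0)
y_5_to_25 = step_response(5.0, 25.0)

print("same fold change (2x):   max gap =", max_gap(y_2_to_4, y_5_to_10))
print("different fold (2x, 5x): max gap =", max_gap(y_2_to_4, y_5_to_25))
```

With these assumptions, the two steps with the same fold change give outputs that agree up to floating-point rounding, while the step with fold change 5 gives a visibly different transient.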



(a) scaled inputs (b) adapting system (c) Weber-like (d) FCD

Figure 4: In an adapting system, the response to scaled steps (a) returns to the same adapted
value, but may differ in time (b), or may produce the same maximal response (c) or even identical
transients (d).

Equipped with this formalism, we may study a far more general question, namely invariance 
with respect to any set of transformations, not merely scalings but also far more general invariances 
to sensory field changes: translational, rotational or reflection symmetries, anisotropic dilations, 
projective transformations, and so forth. 

2.2.2 General case 

Suppose given a set $\mathcal{P}$ of continuous and onto input transformations $\pi : U \to U$:

\[ \{ \pi : U \to U, \ \pi \in \mathcal{P} \}. \]

For an input $u(t)$, we denote by "$\pi u$" the function of time that equals $\pi(u(t))$ at time $t$. (For
notational simplicity, we write "$\pi u$" rather than "$\pi(u)$" when there is no risk of confusion, but we
emphasize that there is no requirement that the mapping $\pi$ be linear.) The continuity assumption is
only made in order to ensure that $\pi u$ is a piecewise continuous function of time if $u$ is, as
technically required when using it as an input to a differential equation. The ontoness assumption, that
is, $\pi U = U$, is made for technical reasons in the proof of the main theorem.

Definition 2.2 The system (1) has response invariance to symmetries in $\mathcal{P}$ or, for short, is
$\mathcal{P}$-invariant if (2) holds for all $t \ge 0$, all inputs $u = u(t)$, all constants $\bar u$, and all transformations $\pi \in \mathcal{P}$.
□

Under the assumption that the action of $\mathcal{P}$ is transitive, i.e., for any two $\bar u, \bar v \in U$, there is
some $\pi$ such that $\bar v = \pi \bar u$, $\mathcal{P}$-invariance implies perfect adaptation, because the outputs in (2) must
coincide at time zero, where they equal $h(\sigma(\bar u))$ and $h(\sigma(\pi \bar u))$ respectively, and any two inputs can be mapped to each other.

Examples: In the special case of the transitive action with $\mathcal{P} = U = \mathbb{R}_{>0}$ and $\pi u = pu$, we obtain
the FCD property. Another example of a set of symmetries $\mathcal{P}$ is as follows. Suppose that $U = \mathbb{R}^n$
and $\mathcal{P}$ consists of all the transformations $\pi_R(u) := Ru$, for $R \in SO(n)$, the special orthogonal group.
That is, we transform inputs by multiplication with a rotation matrix $R$. In dimensions 2 or 3, this
can be used to impose the requirement that a visual sensing system should react the same if the
visual field is rotated. Another slightly different example would be that in which $R \in O(n)$, the
orthogonal group, meaning that we want invariance with respect to reflections as well as rotations.
2.3 Equivariances



We now introduce a concept that leads to an effective criterion for checking FCD, and more
generally arbitrary $\mathcal{P}$-invariance, without having to compute the solutions $\psi(t, \sigma(\pi \bar u), \pi u)$. We make
the definition for arbitrary systems as in (1), $\dot z = F(z, u)$, $y = h(z)$.

Definition 2.3 Given a system (1) and a set of input transformations $\mathcal{P}$, a parametrized set of
differentiable mappings $\{\rho_\pi : Z \to Z\}_{\pi \in \mathcal{P}}$ is a $\mathcal{P}$-equivariance family provided that, for each $\pi$:

\[ F(\rho_\pi(z), \pi u) = (\rho_\pi)_*(z)\, F(z, u) \quad \text{and} \quad h(\rho_\pi(z)) = h(z) \qquad \text{for all } z \in Z,\ u \in U \tag{3} \]

($(\rho_\pi)_*$ = Jacobian matrix of $\rho_\pi$). If (3) holds, we say that the system is $\rho_\pi$-equivariant under the input
transformation $\pi$. □



The first part of Equation (3) is a first order quasilinear partial differential equation on the $n$
components of the vector function $\rho_\pi$, and the second part is an additional algebraic constraint on
these components, for all $u \in U$. We omit the subscript $\pi$ when clear from the context. Quasilinear
first-order PDEs appear in related questions in control theory [5], for instance in the
Hamilton-Jacobi-Bellman approach to optimal control and in feedback linearization based on Frobenius' Theorem.
They may in principle be solved using the method of characteristics [34].
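For a given candidate map, both conditions of the equivariance definition can be tested pointwise without solving any differential equation. The sketch below is an illustration under stated assumptions: it takes the nonlinear integral feedback systems of Fig. 1(c) and Fig. 1(d) with all constants set to 1, uses the scaling $\pi u = pu$, and tries the candidate $\rho(x, y) = (p^{\varepsilon} x, y)$, whose Jacobian is $\mathrm{diag}(p^{\varepsilon}, 1)$. The function names and the sampling ranges are hypothetical choices.

```python
import random

# Assumed, illustrative parameter values (not taken from the paper).
ALPHA, BETA, GAMMA, Y0 = 1.0, 1.0, 1.0, 1.0


def F_c(x, y, u):
    """Fig. 1(c): (alpha x (y - y0), beta u / x - gamma y)."""
    return (ALPHA * x * (y - Y0), BETA * u / x - GAMMA * y)


def F_d(x, y, u):
    """Fig. 1(d): (alpha x (y0 - y), beta u x - gamma y)."""
    return (ALPHA * x * (Y0 - y), BETA * u * x - GAMMA * y)


def residual(F, eps, p, x, y, u):
    """Largest component of F(rho(z), p u) - J F(z, u) for the candidate
    rho(x, y) = (p**eps * x, y), whose Jacobian is J = diag(p**eps, 1).
    The output condition h(rho(z)) = h(z) holds by construction since h = y."""
    s = p ** eps
    lhs = F(s * x, y, p * u)
    fx, fy = F(x, y, u)
    return max(abs(lhs[0] - s * fx), abs(lhs[1] - fy))


def worst_residual(F, eps, trials=1000, seed=0):
    """Largest residual over randomly sampled positive p, x, y, u."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p, x, y, u = (rng.uniform(0.2, 5.0) for _ in range(4))
        worst = max(worst, residual(F, eps, p, x, y, u))
    return worst


print("Fig. 1(c), eps = +1:", worst_residual(F_c, +1))
print("Fig. 1(d), eps = -1:", worst_residual(F_d, -1))
print("Fig. 1(c), eps = -1 (wrong candidate):", residual(F_c, -1, 2.0, 1.0, 1.0, 1.0))
```

A residual at rounding level for every sampled point is consistent with the candidate being an equivariance; a single point with a large residual rules the candidate out. A numerical check of this kind is only evidence, not a proof.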

Our notion of equivariance generalizes a concept of the same name that is fundamental in group theory,
and specifically in the symmetry analysis of nonlinear dynamical systems [35, 36].
A parametrized vector field $F$ is said to be $\rho$-equivariant, or $\rho$ is a symmetry of $F$, if, for
each solution $z(t)$ of $\dot z = F(z, u)$, $\rho(z(t))$ is also a solution. This property is equivalent to the PDE
$F(\rho(z), u) = \rho_*(z) F(z, u)$ [35] (with $\rho_*$ the Jacobian matrix of $\rho$) that is part of the equivariance
definition, when $\pi$ = identity. We
are generalizing this concept in several ways. First of all, we must consider the far more general
case $\pi \ne$ identity; in fact, in our context, $\pi$ = identity is not of any interest whatsoever, since
we are precisely interested in the effect of the input transformations. Second, it is essential to our
definition that we include the algebraic "boundary condition" $h(\rho_\pi(z)) = h(z)$. Finally, in dynamical
systems one typically studies equivariances only with respect to Lie group actions. Moreover, since
compact Lie groups acting on Euclidean space can be identified with subgroups of the orthogonal
group $O(n)$, one finds in the classical definitions [35] only linear maps $\rho(z) = \gamma z$ with $\gamma \in O(n)$
being considered (so $\rho_*(z) = \gamma$).



2.4 Statement of main theorem 

Our main result is as follows. It is proved in Section 4.

Theorem 1 An analytic and irreducible system is P-invariant if and only if there exists a P- 
equivariance family. 

An analytic system is one for which all the functions defining the system are real-analytic on 
the state coordinates. This means that they can be expanded into locally convergent power series 
around each point. Every function made up of elementary algebraic compositions of elementary 
functions is analytic; this includes all expressions involving polynomials and, more generally,
any well-defined (no poles) rational functions (so mass-action kinetics and Hill-type models with
any integer Hill coefficients are allowed) as well as trigonometric functions, exponentials, and
logarithms in any combinations. All our examples are analytic, as long as we restrict expressions
such as $u/x$ to $x > 0$ (or $x < 0$), so that there are no poles.

An irreducible system is one for which no conservation laws restrict motions to proper
submanifolds (accessibility property) and no pairs of distinct states give rise to the same input/output
behavior (observability property). We define accessibility and observability precisely in Section 4.
Irreducibility is a weak technical assumption; we will show that these two properties hold for all
the systems in Figs. 1 and 2. Irreducible systems, which are also called "minimal" or "canonical"
in the control theory literature, are minimal in the sense that no lower-dimensional subsystem has
an identical input/output behavior [37]-[39]. Analogous, but simpler, notions of irreducibility also
appear in areas such as group representations (Schur's Lemma).



3 Examples of finding symmetries using the main theorem 

We show here how Theorem 1 allows one to immediately determine invariance properties for
large classes of two-dimensional systems, including the integral feedback and feedforward
examples shown in Figs. 1 and 2. In all these cases, the PDE for equivariances, if there is one, can be
easily solved in closed form. We consider two-dimensional systems with output equal to one
of the coordinates. We write $z = (x, y)$, $F(z, u) = (f(x, y, u), g(x, y, u))$:

\[ \begin{aligned} \dot x &= f(x, y, u) \\ \dot y &= g(x, y, u) \\ h(x, y) &= y \end{aligned} \]

and wish to determine for which possible input set mappings $\pi : U \to U$ there is an associated
equivariance $\rho = \rho_\pi$. We drop the subscript and write $\rho = (\rho^x, \rho^y)$. Since $h(x, y) = y$, the condition
$h(\rho(z)) = h(z)$ says that $\rho^y(x, y) = y$. Thus finding $\rho$ is equivalent to finding its $x$-component, a
function $\rho^x$ that satisfies:

\[ \begin{aligned} f(\rho^x(x, y), y, \pi u) &= \frac{\partial \rho^x}{\partial x}(x, y)\, f(x, y, u) \\ g(\rho^x(x, y), y, \pi u) &= g(x, y, u) \end{aligned} \]

(no derivative in the second equation because $\partial \rho^y/\partial y = 1$). This is a scalar first order quasilinear
PDE subject to a side algebraic "boundary condition".

We specialize next to two special cases that cover many examples of interest. Particular instances
of the first case are the systems in Figs. 1(c,d) and 2(a). A particular instance of the second case
is the system in Fig. 1(a). We assume that the systems in both of the next Lemmas are irreducible.
For all our examples, irreducibility is checked in Section 4.

Lemma 3.1 Suppose that:

\[ g(x, y, u) = G(u^{\beta} x^{\mu}, y) \]

and $G(\cdot, y)$ is one-to-one for each fixed $y$. (Assuming $x > 0$ if $\mu < 0$, or $u > 0$ if $\beta < 0$.) Then, the
only possible symmetries are fold-changes $\pi u = pu$. Furthermore, the system is invariant under a
set $\mathcal{P}$ of such symmetries if and only if

\[ p^{-\beta/\mu} f(x, y, u) = f(p^{-\beta/\mu} x, y, pu) \qquad \text{for all } x, y, u \]

and each $p$ in the set.

In the special case in which $\beta = 1$ and $\mu = -1$, that is, if $g$ depends on the ratio $u/x$, this means that
$f$ must satisfy:

\[ p f(x, y, u) = f(px, y, pu) \]

and in the special case $\beta = \mu = 1$, $f$ must satisfy $p^{-1} f(x, y, u) = f(p^{-1} x, y, pu)$. In either special
case, if $f$ is independent of $u$, then response invariance to all scaling transformations ($\mathcal{P} = \mathbb{R}_{>0}$) is
equivalent to the requirement that $f$ be homogeneous of degree 1 in $x$.

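Proof. Since $G(\cdot, y)$ is one-to-one,

\[ G((\pi u)^{\beta} (\rho^x(x, y))^{\mu}, y) = g(\rho^x(x, y), y, \pi u) = g(x, y, u) = G(u^{\beta} x^{\mu}, y) \qquad \text{for all } x, y, u \]

implies that:

\[ (\pi u)^{\beta} (\rho^x(x, y))^{\mu} = u^{\beta} x^{\mu} \qquad \text{for all } x, y, u \]

or, equivalently:

\[ \frac{\pi u}{u} = \left( \frac{\rho^x(x, y)}{x} \right)^{-\mu/\beta} \qquad \text{for all } x, y, u. \]

The left-hand side depends only on $u$ and the right-hand side only on $(x, y)$, so both equal a constant. Define

\[ p := \frac{\pi u_0}{u_0} \]

for any fixed but arbitrary element $u_0 \in U$. It follows that

\[ \pi u = pu \quad \text{and} \quad \rho^x(x, y) = p^{-\beta/\mu} x \qquad \text{for all } x, y, u, \]

from which all the conclusions are immediate. ■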

Lemma 3.2 Suppose that:

\[ g(x, y, u) = G(\mu x + \beta u, y) \]

and $G(\cdot, y)$ is one-to-one for each fixed $y$. Then, the only possible symmetries are translations
$\pi u = p + u$. Furthermore, the system is invariant under a set $\mathcal{P}$ of such symmetries if and only if

\[ f(x, y, u) = f(-(\beta/\mu) p + x, y, p + u) \qquad \text{for all } x, y, u \]

and each $p$ in the set.
Proof. Since $G(\cdot, y)$ is one-to-one,

\[ G(\mu \rho^x(x, y) + \beta \pi u, y) = g(\rho^x(x, y), y, \pi u) = g(x, y, u) = G(\mu x + \beta u, y) \]

implies that:

\[ \mu \rho^x(x, y) + \beta \pi u = \mu x + \beta u \qquad \text{for all } x, y, u \]

or, equivalently:

\[ \pi u - u = -\frac{\mu}{\beta} \left( \rho^x(x, y) - x \right) \qquad \text{for all } x, y, u. \]

Define

\[ p := \pi u_0 - u_0 \]

for any fixed but arbitrary element $u_0 \in U$. It follows that

\[ \pi u = p + u \quad \text{and} \quad \rho^x(x, y) = -\frac{\beta p}{\mu} + x \qquad \text{for all } x, y, u, \]

from which all the conclusions are immediate. ■

We can now quickly classify the examples shown in Figs. 1 and 2.

The linear integral feedback system in Fig. 1(a) fits the form in Lemma 3.2 (with $-\mu$ playing the role
of the coefficient of $x$ there), so it can only
be $\mathcal{P}$-invariant with respect to transformations $u \mapsto p + u$, and the only possible equivariance is
$\rho^x(x, y) = x + \beta p/\mu$. Since $f(x, y, u)$ is independent of $x$ and $u$, this is indeed an equivariance. Thus
this system is $\mathcal{P}$-invariant with respect to translations.



The systems in Fig. 1(c,d) and Fig. 2(a) all fit the form in Lemma 3.1, so they can only be
$\mathcal{P}$-invariant with respect to scaling transformations $u \mapsto pu$, and invariance is equivalent to
the condition

\[ p^{\varepsilon} f(x, y, u) = f(p^{\varepsilon} x, y, pu) \]

where $\varepsilon$ is $+1$ and $-1$ for the systems in Fig. 1(c,d) respectively, and is $+1$ for the system in
Fig. 2(a). In Fig. 1(c,d), the value of $\varepsilon$ is irrelevant, because $f(x, y, u)$ is independent of $u$ and is
homogeneous of degree 1 in $x$, so the property holds. In Fig. 2(a), $f$ is homogeneous of degree 1 in
$x$ and $u$ simultaneously, so again the property holds. In summary, all three systems are $\mathcal{P}$-invariant
with respect to scalings.

The log-linear system in Fig. 1(b) is also $\mathcal{P}$-invariant for the set of scalings. This may be shown
with the equivariance $\rho^x(x, y) = x + \beta \ln p/\mu$.

We remark that, generalizing Fig. 1(a) and Fig. 2(a), any $n$-dimensional linear system $\dot x = Ax + bu$
with a stable $A$ and $h(x) = cx$ such that $cA^{-1}b = 0$ (i.e., its DC gain is zero) is $\mathcal{P}$-invariant for
$u \mapsto p + u$, with $\rho(x) = x - A^{-1}bp$. The corresponding log-linear system, in which $\dot x = Ax + b \ln u$,
is invariant with respect to scalings.
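The zero-DC-gain condition and the resulting translation invariance can be checked on the linear integral feedback system of Fig. 1(a). The sketch below is illustrative: the parameter values, the test input `probe`, and the shift `P_SHIFT` are assumptions, and the constant offset $y_0$ in the $x$-equation is kept, so the system is linear only after shifting coordinates. It computes $cA^{-1}b$ for the matrices read off from Fig. 1(a) and then compares the outputs for inputs $u$ and $p + u$, each started from its own pre-adapted state.

```python
import math

# Assumed, illustrative parameter values (not taken from the paper).
ALPHA, BETA, MU, GAMMA, Y0 = 1.0, 1.0, 1.0, 2.0, 1.0

A = [[0.0, ALPHA], [-MU, -GAMMA]]   # state matrix of Fig. 1(a), state (x, y)
b = [0.0, BETA]                     # input enters the y-equation
c = [0.0, 1.0]                      # output is y


def solve_2x2(M, v):
    """Return M^{-1} v for a nonsingular 2x2 matrix M."""
    (m11, m12), (m21, m22) = M
    det = m11 * m22 - m12 * m21
    return [(m22 * v[0] - m12 * v[1]) / det, (m11 * v[1] - m21 * v[0]) / det]


A_inv_b = solve_2x2(A, b)
dc_term = c[0] * A_inv_b[0] + c[1] * A_inv_b[1]   # c A^{-1} b


def rhs(z, u):
    x, y = z
    return [ALPHA * (y - Y0), BETA * u - MU * x - GAMMA * y]


def sigma(u_bar):
    """Steady state for the constant input u_bar."""
    return [(BETA * u_bar - GAMMA * Y0) / MU, Y0]


def response(u_bar, u, t_end=10.0, dt=0.001):
    """Output samples for input u(t), t >= 0, starting pre-adapted to u_bar (RK4)."""
    z = sigma(u_bar)
    ys = [z[1]]
    for k in range(int(round(t_end / dt))):
        t = k * dt
        k1 = rhs(z, u(t))
        k2 = rhs([a + 0.5 * dt * d for a, d in zip(z, k1)], u(t + 0.5 * dt))
        k3 = rhs([a + 0.5 * dt * d for a, d in zip(z, k2)], u(t + 0.5 * dt))
        k4 = rhs([a + dt * d for a, d in zip(z, k3)], u(t + dt))
        z = [a + dt / 6.0 * (d1 + 2 * d2 + 2 * d3 + d4)
             for a, d1, d2, d3, d4 in zip(z, k1, k2, k3, k4)]
        ys.append(z[1])
    return ys


def probe(t):
    """Hypothetical test input: a jump plus an oscillation."""
    return 1.5 + 0.5 * math.sin(2.0 * t)


P_SHIFT = 7.0
y_ref = response(1.0, probe)
y_shifted = response(1.0 + P_SHIFT, lambda t: probe(t) + P_SHIFT)
gap = max(abs(p - q) for p, q in zip(y_ref, y_shifted))

print("c A^{-1} b =", dc_term)
print("max output gap between u and p + u:", gap)
```

Here $A^{-1}b = (-\beta/\mu, 0)$, so $\rho(x, y) = (x + \beta p/\mu, y)$, which agrees with the equivariance found for Fig. 1(a) via Lemma 3.2.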

Finally, we study the "sniffer" IFFL shown in Fig. 2(b). The equation "$g(\rho^x(x, y), y, \pi u) =
g(x, y, u)$" means that $\beta \pi u - \gamma \rho^x(x, y) y = \beta u - \gamma x y$ for all $x, y, u$, and thus evaluating at $y = 0$ it
follows that $\pi u = u$ (assuming $\beta \ne 0$). So no nontrivial $\mathcal{P}$-equivariance exists. By the necessity
part of Theorem 1 we conclude that this system is not $\mathcal{P}$-invariant for any possible nontrivial $\mathcal{P}$.
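The failure of scale invariance for the sniffer can also be seen in simulation. The following sketch is illustrative only, with all rate constants set to 1 as an assumption and hypothetical helper names: it applies two steps with the same fold change (1 to 2 and 10 to 20) to the system of Fig. 2(b), each from its pre-adapted state, and measures how far apart the outputs are.

```python
# Assumed, illustrative parameter values (not taken from the paper).
ALPHA, DELTA, BETA, GAMMA = 1.0, 1.0, 1.0, 1.0


def sniffer(z, u):
    """Fig. 2(b): x' = alpha u - delta x, y' = beta u - gamma x y."""
    x, y = z
    return [ALPHA * u - DELTA * x, BETA * u - GAMMA * x * y]


def sigma(u_bar):
    """Steady state for the constant input u_bar."""
    return [ALPHA * u_bar / DELTA, BETA * DELTA / (ALPHA * GAMMA)]


def step_response(u_before, u_after, t_end=10.0, dt=0.001):
    """Output samples after pre-adapting to u_before and stepping to u_after at t = 0 (RK4)."""
    z = sigma(u_before)
    ys = [z[1]]
    for _ in range(int(round(t_end / dt))):
        k1 = sniffer(z, u_after)
        k2 = sniffer([a + 0.5 * dt * b for a, b in zip(z, k1)], u_after)
        k3 = sniffer([a + 0.5 * dt * b for a, b in zip(z, k2)], u_after)
        k4 = sniffer([a + dt * b for a, b in zip(z, k3)], u_after)
        z = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(z, k1, k2, k3, k4)]
        ys.append(z[1])
    return ys


y_low = step_response(1.0, 2.0)
y_high = step_response(10.0, 20.0)
sniffer_gap = max(abs(p - q) for p, q in zip(y_low, y_high))

print("start values:", y_low[0], y_high[0])
print("max output gap for the same fold change:", sniffer_gap)
```

Both runs start from the same adapted output, as adaptation requires, but the transients differ: scaling the input by $p$ also scales $x$, which speeds up the $y$-dynamics through the product $xy$.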
Remark 3.3 Interestingly, although not invariant, the system in Fig. 2(b), as well as many other
examples from [27], satisfies an "approximate" invariance property, in the following sense. Suppose
that the $y$ variable varies faster than the $x$ variable, so that one may make a quasi-steady state
approximation $\beta u - \gamma x y = 0$. This allows one to reduce to a one-dimensional system $\dot x = \alpha u - \delta x$
with output $y = (\beta/\gamma)(u/x)$. Scalings $u \mapsto pu$ and $x \mapsto px$ preserve the equations, thus showing
that the reduced system is $\mathcal{P}$-invariant with respect to scaling. (More generally, given a linear
system with an output that depends on the ratio $x_i/x_j$ of two variables $x_i$ and $x_j$, one obtains
scale-invariance.) This means that the original system is "close" to having the scale invariance property.
A precise statement can be made using singular perturbation theory. This observation was made
several years ago in the context of models of Dictyostelium chemotaxis and neutrophils pO| . □ 

Remark 3.4 An example with vector inputs is as follows. Suppose that we consider a vector integral feedback system of the following form:

$$\dot x = (y - y_0)x$$
$$\dot y = G(\langle u, x\rangle, y)$$

with output $y$. The state-space $Z$ is $\mathbb{R}^{n+1}$: $x$ and $u$ are $n$-dimensional real vectors and $y$ is scalar, and $\langle u, x\rangle$ denotes the inner (dot) product of $x$ and $u$. We claim that this system is $\mathcal{P}$-invariant with respect to the rotation group $\mathcal{P} = SO(n)$. Indeed, for each $R \in SO(n)$ we may define the equivariance $\rho_R(x,y)$ by mapping $(x,y)$ to the $(n+1)$-vector with first $n$ components $Rx$ and last component $y$. Since $\rho_R$ is linear, and its partial derivative with respect to $y$ is 1 and its Jacobian with respect to the $x$ variables is $R$, we only need to check that $R(y - y_0)x = (y - y_0)Rx$, which is true because $y - y_0$ is a scalar, and that $G(\langle Ru, Rx\rangle, y) = G(\langle u, x\rangle, y)$, which is true because $R$ is a special orthogonal matrix. The exact same proof works for the larger orthogonal group (reflections/rotations) $O(n)$. We can also generalize to the case when $\mathcal{P}$ consists of completely arbitrary nonsingular transformations ($R \in GL(n)$). In that case, we would take $\rho_R(x) = (R^T)^{-1}x$ (inverse transpose) and use that $\langle Ru, (R^T)^{-1}x\rangle = \langle u, x\rangle$. □
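
The identity used in the $GL(n)$ case can be checked numerically. The following is a minimal sketch in plain Python for $n = 2$; the matrix `R`, the vectors `u` and `x`, and all helper names are hypothetical test data chosen for illustration, not values from the paper.

```python
def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def dot(a, b):
    """Euclidean inner product <a, b>."""
    return sum(ai * bi for ai, bi in zip(a, b))


def inverse_transpose_2x2(M):
    """Return (M^T)^{-1} for a nonsingular 2x2 matrix M."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    # M^{-1} = (1/det) [[d, -b], [-c, a]], so its transpose is (1/det) [[d, -c], [-b, a]].
    return [[d / det, -c / det], [-b / det, a / det]]


# Hypothetical test data: R is nonsingular (det = 5.5) but not orthogonal.
R = [[2.0, 1.0], [0.5, 3.0]]
u = [1.5, -0.25]
x = [0.75, 2.0]

lhs = dot(mat_vec(R, u), mat_vec(inverse_transpose_2x2(R), x))  # <Ru, (R^T)^{-1} x>
rhs = dot(u, x)                                                  # <u, x>
naive = dot(mat_vec(R, u), mat_vec(R, x))                        # <Ru, Rx>, differs unless R is orthogonal

print(lhs, rhs, naive)
```

For a non-orthogonal `R` the pairing is preserved only with the inverse transpose, which is why $\rho_R(x) = Rx$ works for $SO(n)$ and $O(n)$ but has to be replaced in the general case.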

4 Details and proof of main theorem 

The main theorem is stated for systems that are irreducible, meaning that both the accessibility and the observability properties must hold. We define precisely and discuss these two properties next.

4.1 The accessibility property and the accessibility Lie algebra

In order to define accessibility, we will need to employ the notion of accessibility Lie algebra associated to a system (1), which we briefly review here; see [5] for further details and basic properties. Recall first the notion of Lie bracket of two vector fields on a manifold, specialized here to open subsets of Euclidean space $\mathbb{R}^n$: for any two differentiable vector fields $f, g : Z \to \mathbb{R}^n$ defined in an open set $Z \subseteq \mathbb{R}^n$, their Lie bracket is the new vector field

$$[f,g] : Z \to \mathbb{R}^n$$

defined by the formula

$$[f,g] := \frac{\partial g}{\partial x} f - \frac{\partial f}{\partial x} g .$$

When $f, g$ are twice differentiable, $[f,g]$ is again differentiable, and one can take further brackets such as $[[f,g],g]$. More generally, if the vector fields are smooth (that is, infinitely differentiable), one may take any number of iterated brackets.

For any subset of smooth vector fields $\mathcal{F}$ on $Z$, the Lie algebra generated by $\mathcal{F}$, denoted as $\mathcal{F}_{LA}$, is defined as the intersection of all the Lie algebras of vector fields which contain $\mathcal{F}$. (The set of all such algebras is nonempty, since it includes the algebra of all vector fields on $Z$.) Since the intersection of any family of Lie algebras is also a Lie algebra, it follows that $\mathcal{F}_{LA}$ is the smallest Lie algebra of vector fields which contains the set $\mathcal{F}$, and it coincides with the set obtained as follows. Denote $\mathcal{F}_0 := \mathcal{F}$, and, recursively, $\mathcal{F}_{k+1} := \{[f,g] \mid f \in \mathcal{F}_k,\ g \in \mathcal{F}\}$, $k = 0, 1, 2, \ldots$, as well as $\mathcal{F}_\infty := \bigcup_{k \ge 0} \mathcal{F}_k$. Then, $\mathcal{F}_{LA}$ is equal to the linear span of $\mathcal{F}_\infty$. (See [5], Lemma 4.1.4 for a proof of this equality.) In summary, every element in the Lie algebra generated by the set $\mathcal{F}$ can be expressed as a linear combination of iterated brackets of the form

$$[f_\ell, \ldots, [f_3, [f_2, f_1]] \ldots],$$

for some $f_i \in \mathcal{F}$.

If the system (1) is such that the state-space $Z$ is an open subset of $\mathbb{R}^n$ and all the vector fields $\mathcal{F} = \{f(\cdot,u),\ u \in U\}$ are smooth, we define its accessibility Lie algebra as $\mathcal{F}_{LA}$. For each $z_0 \in Z$, one may consider the subspace $\mathcal{F}_{LA}(z_0) := \{X(z_0),\ X \in \mathcal{F}_{LA}\}$ of $\mathbb{R}^n$. The accessibility rank condition is said to hold for the system if

$$\mathcal{F}_{LA}(z_0) = \mathbb{R}^n \quad \text{for every } z_0 \in Z .$$

For analytic systems, the accessibility condition is equivalent to the property that the set of points reachable from any given state $z$ has a nonempty interior, as follows by a result of Sussmann [37] that generalizes Nagano's theorem [41] on integrability of involutive families of vector fields; see a proof and more details in the textbook [5].

In the special case of input-affine systems, i.e. those defined by differential equations

$$\dot x = g_0(x) + u_1 g_1(x) + \ldots + u_m g_m(x) \qquad (4)$$

where $g_i$, $i = 0, \ldots, m$ are $m + 1$ vector fields, it is not necessary to use all the vector fields in $\mathcal{F}$ when generating $\mathcal{F}_{LA}$, since $\mathcal{F}_{LA}(z_0) = \{g_0, \ldots, g_m\}_{LA}(z_0)$ for all $z_0$. See [5], Lemma 4.3.3 for a proof. (The technical condition in that Lemma that zero must belong to the input set $U$ fails when inputs are asked to be strictly positive, as in many of our applications. However, the Lemma also holds when $U$ consists of positive inputs, since a continuity argument makes clear that one obtains the same spans if $0$ is added to $U$.)

For example, we show next that all the systems in Figs. 1 and 2, which are all input-affine (with $m = 1$), satisfy the accessibility rank condition.

$\dot x = \alpha(y - y_0)$, $\dot y = \beta u - \mu x - \gamma y$. Here:

$$g_0 = \begin{pmatrix} \alpha(y - y_0) \\ -\mu x - \gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} 0 \\ \beta \end{pmatrix}, \quad [g_0, g_1] = \begin{pmatrix} -\alpha\beta \\ \beta\gamma \end{pmatrix}.$$

Since the determinant of $(g_1, [g_0,g_1])$ equals $\alpha\beta^2$ at every $x$, the accessibility rank condition holds.

$\dot x = \alpha(y - y_0)$, $\dot y = \beta\ln u - \mu x - \gamma y$. This is the same as the previous case, in so far as the accessibility condition is concerned, because the set of vector fields $\mathcal{F}$ is identical to the previous one.

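$\dot x = \alpha x(y - y_0)$, $\dot y = \beta\frac{u}{x} - \gamma y$. Here:

$$g_0 = \begin{pmatrix} \alpha x(y - y_0) \\ -\gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} 0 \\ \beta/x \end{pmatrix}, \quad [g_0, g_1] = \begin{pmatrix} -\alpha\beta \\ \bigl(\beta\gamma - \alpha\beta(y - y_0)\bigr)/x \end{pmatrix}.$$

Since the determinant of $(g_1, [g_0,g_1])$ equals $\alpha\beta^2/x$ at every $x$, the accessibility rank condition holds.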

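$\dot x = \alpha x(y_0 - y)$, $\dot y = \beta ux - \gamma y$. Here:

$$g_0 = \begin{pmatrix} \alpha x(y_0 - y) \\ -\gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} 0 \\ \beta x \end{pmatrix}, \quad [g_0, g_1] = \begin{pmatrix} \alpha\beta x^2 \\ \alpha\beta x(y_0 - y) + \beta\gamma x \end{pmatrix}.$$

Since the determinant of $(g_1, [g_0,g_1])$ equals $-\alpha\beta^2x^3$, which is nonzero at every $x > 0$, the accessibility rank condition holds.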

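$\dot x = \alpha u - \delta x$, $\dot y = \beta\frac{u}{x} - \gamma y$. Here:

$$g_0 = \begin{pmatrix} -\delta x \\ -\gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} \alpha \\ \beta/x \end{pmatrix}, \quad [g_0, g_1] = \begin{pmatrix} \alpha\delta \\ \beta(\gamma + \delta)/x \end{pmatrix}.$$

Since the determinant of $(g_1, [g_0,g_1])$ equals $\alpha\beta\gamma/x$ at every $x$, the accessibility rank condition holds.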

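$\dot x = \alpha u - \delta x$, $\dot y = \beta u - \gamma xy$. Here:

$$g_0 = \begin{pmatrix} -\delta x \\ -\gamma xy \end{pmatrix}, \quad g_1 = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \quad [g_0, g_1] = \begin{pmatrix} \alpha\delta \\ \alpha\gamma y + \beta\gamma x \end{pmatrix}, \quad [g_1, [g_0, g_1]] = \begin{pmatrix} 0 \\ 2\alpha\beta\gamma \end{pmatrix}.$$

Since the determinant of $(g_1, [g_1, [g_0,g_1]])$ equals $2\alpha^2\beta\gamma$ at every $x$, the accessibility rank condition holds.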
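
The bracket computations above can be cross-checked numerically. The sketch below approximates Jacobians by central differences (exact up to rounding for the affine and quadratic fields used here) and evaluates two of the determinants; the parameter values, the test point, and all function names are illustrative assumptions.

```python
def jacobian(field, z, h=1e-3):
    """Central-difference Jacobian J[i][j] = d field_i / d z_j at the point z."""
    n = len(z)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        zp, zm = list(z), list(z)
        zp[j] += h
        zm[j] -= h
        fp, fm = field(zp), field(zm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


def lie_bracket(f, g):
    """Return the vector field [f, g] = (dg/dz) f - (df/dz) g."""
    def bracket(z):
        Jf, Jg = jacobian(f, z), jacobian(g, z)
        fz, gz = f(z), g(z)
        n = len(z)
        return [sum(Jg[i][j] * fz[j] - Jf[i][j] * gz[j] for j in range(n))
                for i in range(n)]
    return bracket


def det2(a, b):
    """Determinant of the 2x2 matrix with columns a and b."""
    return a[0] * b[1] - a[1] * b[0]


# Hypothetical parameter values and test point (x, y).
alpha, beta, gamma, delta, mu, y0 = 1.3, 0.7, 0.9, 0.4, 1.1, 0.5
z = [0.8, -0.3]


# Linear integral feedback: x' = alpha (y - y0), y' = beta u - mu x - gamma y.
def g0_lin(s):
    return [alpha * (s[1] - y0), -mu * s[0] - gamma * s[1]]


def g1_lin(s):
    return [0.0, beta]


# "Sniffer" IFFL: x' = alpha u - delta x, y' = beta u - gamma x y.
def g0_snf(s):
    return [-delta * s[0], -gamma * s[0] * s[1]]


def g1_snf(s):
    return [alpha, beta]


det_lin = det2(g1_lin(z), lie_bracket(g0_lin, g1_lin)(z))
inner = lie_bracket(g0_snf, g1_snf)                        # [g0, g1]
det_snf = det2(g1_snf(z), lie_bracket(g1_snf, inner)(z))   # uses [g1, [g0, g1]]

print(det_lin, alpha * beta ** 2)
print(det_snf, 2 * alpha ** 2 * beta * gamma)
```

Both determinants are independent of the test point, in agreement with the closed forms $\alpha\beta^2$ and $2\alpha^2\beta\gamma$.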



4.2 The observability property 

A system (1) is said to be observable, or to have the observability property, if no two distinct states can give rise to an identical temporal response to all possible inputs. Formally:

$$\psi(t,z_0,u) = \psi(t,\tilde z_0,u)\ \ \forall\, u,t \quad\Rightarrow\quad z_0 = \tilde z_0 .$$

For analytic input-affine systems (4) with output $h = (h_1, \ldots, h_p)$, one can restate the observability property as follows. The observation space $\mathcal{O}$ associated to the system is the vector space spanned by the set of all functions of the type:

$$L_{g_{i_1}} \cdots L_{g_{i_k}} h_j \qquad (5)$$

(called the elementary observables of the system) over all possible sequences $i_1, \ldots, i_k$, $k \ge 0$, out of $\{0, \ldots, m\}$ and all $j = 1, \ldots, p$, where $L_X H = \nabla H \cdot X$ is the directional or Lie derivative of the function $H$ with respect to the vector field $X$ and one understands $L_Y L_X H$ as the iteration $L_Y(L_X H)$. We include the case in which $k = 0$, in which case the expression in (5) is simply $h_j$. Two states $z_1$ and $z_2$ are said to be separated by $\mathcal{O}$ if there exists some $H \in \mathcal{O}$ such that $H(z_1) \neq H(z_2)$. Observability is equivalent to the property that any two distinct states can be separated by the observation space. See [5], Remark 6.4.2 for a proof and discussion.

For example, it is very easy to see that all the systems in Figs. 1 and 2, which are all input-affine (with $m = 1$ and $p = 1$), satisfy the observability condition. We must prove that if $H(z_1) = H(z_2)$ for every elementary observable $H$, then $z_1 = z_2$. Since already with $k = 0$ we have that $y_1 = h(z_1) = h(z_2) = y_2$, it only remains to show that some linear combination of observables gives a one-to-one function of $x$. Note that $\nabla h = (0, 1)$, so the dot product $L_g h = \nabla h \cdot g$ simply picks out the second coordinate of $g$.

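$\dot x = \alpha(y - y_0)$, $\dot y = \beta u - \mu x - \gamma y$. Here:

$$g_0 = \begin{pmatrix} \alpha(y - y_0) \\ -\mu x - \gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} 0 \\ \beta \end{pmatrix}, \quad L_{g_0}h = -\mu x - \gamma y.$$

Since $L_{g_0}h + \gamma h = -\mu x$ is one-to-one on $x$, the observability condition holds.

$\dot x = \alpha(y - y_0)$, $\dot y = \beta\ln u - \mu x - \gamma y$. This is the same as the previous case, in so far as the observability condition is concerned, because $\psi(t, z_0, u)$ is the same as $\psi(t, z_0, \log u)$ for the previous system.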

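$\dot x = \alpha x(y - y_0)$, $\dot y = \beta\frac{u}{x} - \gamma y$. Here:

$$g_0 = \begin{pmatrix} \alpha x(y - y_0) \\ -\gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} 0 \\ \beta/x \end{pmatrix}, \quad L_{g_1}h = \beta/x.$$

Since $L_{g_1}h$ is one-to-one on $x$, the observability condition holds.

$\dot x = \alpha x(y_0 - y)$, $\dot y = \beta ux - \gamma y$. Here:

$$g_0 = \begin{pmatrix} \alpha x(y_0 - y) \\ -\gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} 0 \\ \beta x \end{pmatrix}, \quad L_{g_1}h = \beta x.$$

Since $L_{g_1}h$ is one-to-one on $x$, the observability condition holds.

$\dot x = \alpha u - \delta x$, $\dot y = \beta\frac{u}{x} - \gamma y$. Here:

$$g_0 = \begin{pmatrix} -\delta x \\ -\gamma y \end{pmatrix}, \quad g_1 = \begin{pmatrix} \alpha \\ \beta/x \end{pmatrix}, \quad L_{g_1}h = \beta/x.$$

Since $L_{g_1}h$ is one-to-one on $x$, the observability condition holds.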
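$\dot x = \alpha u - \delta x$, $\dot y = \beta u - \gamma xy$. Here:

$$g_0 = \begin{pmatrix} -\delta x \\ -\gamma xy \end{pmatrix}, \quad g_1 = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \quad L_{g_0}h = -\gamma xy, \quad L_{g_1}L_{g_0}h = -\alpha\gamma y - \beta\gamma x.$$

Since $L_{g_1}L_{g_0}h + \alpha\gamma h = -\beta\gamma x$ is one-to-one on $x$, the observability condition holds. (Observe that we cannot argue simply with $L_{g_0}h$, because the function $-\gamma xy$ is not one-to-one on $x$ in the special case $y = 0$.)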
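
The last example can be illustrated numerically: the sketch below computes iterated Lie derivatives for the "sniffer" system by central differences and checks that the combination $L_{g_1}L_{g_0}h + \alpha\gamma h$ separates states with $y = 0$, while $L_{g_0}h$ alone does not. Parameter values, test points, and helper names are assumptions made for illustration.

```python
def gradient(H, z, h=1e-3):
    """Central-difference gradient of a scalar function H at the point z."""
    grad = []
    for j in range(len(z)):
        zp, zm = list(z), list(z)
        zp[j] += h
        zm[j] -= h
        grad.append((H(zp) - H(zm)) / (2.0 * h))
    return grad


def lie_derivative(X, H):
    """Return the function L_X H = grad(H) . X."""
    def LXH(z):
        return sum(dH * Xi for dH, Xi in zip(gradient(H, z), X(z)))
    return LXH


alpha, beta, gamma, delta = 1.3, 0.7, 0.9, 0.4   # hypothetical parameter values


def g0(s):
    return [-delta * s[0], -gamma * s[0] * s[1]]


def g1(s):
    return [alpha, beta]


def h_out(s):
    return s[1]


L_g0_h = lie_derivative(g0, h_out)          # analytically -gamma x y
L_g1_L_g0_h = lie_derivative(g1, L_g0_h)    # analytically -alpha gamma y - beta gamma x


def separating_observable(s):
    """L_{g1} L_{g0} h + alpha gamma h, which should equal -beta gamma x."""
    return L_g1_L_g0_h(s) + alpha * gamma * h_out(s)


points = [[0.5, 0.0], [2.0, 0.0], [2.0, -1.0]]
values = [separating_observable(p) for p in points]
print(values)
print(L_g0_h(points[0]), L_g0_h(points[1]))   # both vanish when y = 0
```

The two states with $y = 0$ give the same value of $L_{g_0}h$ but different values of the combined observable, which depends on $x$ only.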



4.3 Proof of Theorem 1

We must prove that, for analytic and irreducible systems, existence of a $\mathcal{P}$-equivariance family is sufficient as well as necessary for $\mathcal{P}$-invariance.

Sufficiency.

Suppose given a $\pi \in \mathcal{P}$ and an associated equivariance $\rho = \rho_\pi$. We claim that the steady-state mapping $\sigma$ interlaces $\pi$ and its associated $\rho = \rho_\pi$, in the sense that

$$\rho(\sigma(\bar u)) = \sigma(\pi(\bar u))$$

for every $\bar u \in U$. Indeed, from the property $F(\rho(z), \pi u) = \rho'(z)F(z,u)$, applied with any constant input $u(t) \equiv \bar u$, and $z = \sigma(\bar u)$, it follows in particular that $F(\rho(\sigma(\bar u)), \pi\bar u) = \rho'(\sigma(\bar u))F(\sigma(\bar u), \bar u)$. Now, $F(\sigma(\bar u), \bar u) = 0$, by definition of $\sigma(\bar u)$, so also $F(\rho(\sigma(\bar u)), \pi\bar u) = 0$, which means that $\rho(\sigma(\bar u))$ is the steady state $\sigma(\pi\bar u)$ corresponding to the constant input $\pi\bar u$, as we claimed.

Now suppose that $z(t) = \varphi(t, \sigma(\bar u), u)$ solves $\dot z = F(z,u)$ with initial condition $z(0) = \sigma(\bar u)$. Consider $z_*(t) = \rho_\pi(z(t))$. Computing the derivative $\dot z_*(t)$, and using the chain rule:

$$\dot z_*(t) = \rho_\pi'(z(t))\,\dot z(t) = \rho_\pi'(z(t))F(z(t),u(t)) = F(\rho_\pi(z(t)), \pi u(t)) = F(z_*(t), \pi u(t)).$$

Moreover, $z_*(0) = \rho_\pi(\sigma(\bar u)) = \sigma(\pi\bar u)$, by the interlacing property. It follows that $z_*$ is the solution with initial condition corresponding to the "preadapted" value $\sigma(\pi\bar u)$ and the input $\pi(u(t))$, i.e. $z_*(t) = \varphi(t, \sigma(\pi\bar u), \pi u)$. We conclude that

$$\psi(t, \sigma(\bar u), u) = h(z(t)) = h(\rho_\pi(z(t))) = \psi(t, \sigma(\pi\bar u), \pi u).$$
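
The sufficiency argument can be illustrated by simulation. The sketch below uses the nonlinear integral feedback system $\dot x = \alpha x(y - y_0)$, $\dot y = \beta u/x - \gamma y$ with the scaling $\pi u = pu$ and equivariance $\rho_\pi(x,y) = (px, y)$, and compares the outputs obtained from the preadapted states $\sigma(\bar u)$ and $\sigma(p\bar u)$. The parameter values, the test input, and the RK4 integrator are assumptions made for this illustration only.

```python
import math

ALPHA, BETA, GAMMA, Y0 = 1.0, 2.0, 1.5, 0.8   # hypothetical parameter values


def rhs(z, u):
    """F(z, u) for x' = alpha x (y - y0), y' = beta u / x - gamma y."""
    x, y = z
    return (ALPHA * x * (y - Y0), BETA * u / x - GAMMA * y)


def sigma(u_bar):
    """Steady state for the constant input u_bar."""
    return (BETA * u_bar / (GAMMA * Y0), Y0)


def simulate_output(z0, u_of_t, t_end=10.0, dt=1e-3):
    """Integrate z' = F(z, u(t)) with classical RK4 and return samples of y(t)."""
    z, ys = z0, [z0[1]]
    for i in range(int(round(t_end / dt))):
        t = i * dt
        k1 = rhs(z, u_of_t(t))
        k2 = rhs((z[0] + 0.5 * dt * k1[0], z[1] + 0.5 * dt * k1[1]), u_of_t(t + 0.5 * dt))
        k3 = rhs((z[0] + 0.5 * dt * k2[0], z[1] + 0.5 * dt * k2[1]), u_of_t(t + 0.5 * dt))
        k4 = rhs((z[0] + dt * k3[0], z[1] + dt * k3[1]), u_of_t(t + dt))
        z = (z[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             z[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
        ys.append(z[1])
    return ys


def u(t):
    """Test input: starts at 1, then oscillates and steps up at t = 3."""
    return 1.0 + 0.5 * math.sin(t) + (1.0 if t > 3.0 else 0.0)


p = 3.0                                                    # scaling pi: u -> p u
y_original = simulate_output(sigma(1.0), u)
y_scaled = simulate_output(sigma(p * 1.0), lambda t: p * u(t))
max_gap = max(abs(a - b) for a, b in zip(y_original, y_scaled))
max_excursion = max(abs(y - Y0) for y in y_original)

print(max_gap, max_excursion)
```

The output moves away from $y_0$ in response to the input, yet the two output trajectories agree up to rounding, as the identity $\psi(t,\sigma(\bar u),u) = \psi(t,\sigma(\pi\bar u),\pi u)$ predicts.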



Necessity.

Suppose given an analytic and irreducible system that is $\mathcal{P}$-invariant. Fix any $\pi \in \mathcal{P}$. We must find a differentiable mapping $\rho = \rho_\pi : Z \to Z$ such that (3) holds: $F(\rho(z), \pi u) = \rho'(z)F(z,u)$ and $h(\rho(z)) = h(z)$. Let us consider a modified system in which $G(z,u) = F(z, \pi u)$ and same output $k(z) = h(z)$. Then (3) asks that

$$\rho'(z)F(z,u) = G(\rho(z), u), \quad k(\rho(z)) = h(z) \quad \forall\, z \in Z,\ u \in U. \qquad (6)$$

In the language of [37], property (6) says that $\rho$ should be an isomorphism between the two systems:

$$\dot z = F(z,u), \quad y = h(z) \qquad (7)$$

and

$$\dot z = G(z,u), \quad y = k(z) \qquad (8)$$

or, equivalently, that the diagram in Fig. 5 should be commutative, where we write "$z \cdot u_t$" and "$z \cdot \pi u_t$" to denote the states $\varphi(t,z,u)$ and $\varphi(t,z,\pi u)$ respectively.

[Diagram: horizontal arrows $z \mapsto z \cdot u_t$ and $z \mapsto z \cdot \pi u_t$, vertical arrows $\rho_\pi$.]

Figure 5: Equivariance as commutative diagram



Paraphrased in our language, Theorem 5 in [37] asserts the following. Suppose given any two analytic and irreducible systems (7) and (8), and two initial states $z_0$ and $\tilde z_0$ respectively, with the property that the input/output behaviors are the same:

$$\psi(t, z_0, u) = \tilde\psi(t, \tilde z_0, u)$$

for all $t$ and all inputs $u$ (where we use tildes for the $\psi$ map of the second system). Then, there exists an isomorphism $\rho$ (that is, Equation (6) holds) such that also $\rho(z_0) = \tilde z_0$. To apply this theorem to our case, we need to show that the modified system, with $G(z,u) = F(z, \pi u)$, is irreducible (it is clearly analytic, since the original system is) and we must find, for our original system and for $G(z,u) = F(z, \pi u)$, two respective initial states $z_0$ and $\tilde z_0$ that lead to the same input/output behavior. This latter requirement is achieved by taking $z_0 = \sigma(\bar u)$ for any fixed $\bar u$, and $\tilde z_0 = \sigma(\pi\bar u)$ for the same $\bar u$. The definition of $\mathcal{P}$-invariance is precisely the statement that the two input/output behaviors are identical. We next verify that the modified system with $G(z,u) = F(z, \pi u)$ is irreducible.

Accessibility: Since $\pi$ is onto, the set of vector fields $\{F(\cdot,u),\ u \in U\}$ is included in the set of vector fields $\{F(\cdot,\pi u),\ u \in U\}$. Thus, the accessibility Lie algebra of the modified system can only be larger, and hence the accessibility property for the original system guarantees that of the modified system.

Observability: Suppose that two different states $z_1, z_2$ give rise to different outputs in the original system, when the input is $u(t)$. Since $\pi$ is onto, there is an input $v$ such that $u(t) = \pi(v(t))$ for all $t$. This means that the outputs corresponding to the initial states $z_1$ and $z_2$ in the modified system, under the input $v$, will be different. This argument can be applied for any two distinct states. Thus, the observability property for the original system guarantees that of the modified system.

This completes the proof of the main theorem. ■ 



Remark 4.1 The intuitive idea of the construction of $\rho$ is not difficult to understand, and is standard in control theory [5, 37-39]: given an initial state $x^0 = \sigma(\bar u)$, input $u$, and time $t \ge 0$, we look at the state $z = \varphi(t, x^0, u)$ and the state $\tilde z = \varphi(t, \tilde x^0, \pi u)$, where $\tilde x^0 = \sigma(\pi\bar u)$, and define $\rho(z) = \tilde z$. The accessibility property says that this map is defined on an open subset of the state space. The interlacing property represented by Fig. 5 is checked by seeing that the states $\rho(z \cdot u_t)$ and $\rho(z) \cdot \pi u_t$ are not distinguishable by any input/output experiments, and hence by observability must be the same. The technical difficulties arise in proving the differentiability of $\rho$ and its definition on the entire state space, not merely a subset, and this was one of the main contributions of [37]. □

Remark 4.2 In the proof of the theorem, we only needed to know the existence of some two initial states $z_0$ and $\tilde z_0$ that lead to identical behaviors. This means that, if we define a "weakly invariant" system as one for which there exists some constant $\bar u$ such that (2) holds: $\psi(t, \sigma(\bar u), u) = \psi(t, \sigma(\pi\bar u), \pi u)$ for all inputs $u$ and all $t \ge 0$ (instead of asking that this holds for every $\bar u$), then weak invariance implies the existence of an equivariance, and hence also invariance. (The irreducibility property plays a subtle role in this argument.) □



5 Stability result for nonlinear integral feedback systems 

We wish to show the global asymptotic stability (GAS) of the unique steady states $(\beta u/(\gamma y_0), y_0)$ and $(\gamma y_0/(\beta u), y_0)$ respectively, of the two nonlinear integral feedback systems in Fig. 1

$$\dot x = \alpha x(y - y_0), \quad \dot y = \beta\frac{u}{x} - \gamma y \qquad\text{and}\qquad \dot x = \alpha x(y_0 - y), \quad \dot y = \beta ux - \gamma y$$

for any constant input $u$, where we assume that $x(t) > 0$ and $u > 0$.

Lemma 5.1 Consider any two-dimensional system evolving on $\mathbb{R}^2$ having the following "nonlinear damping" form:

$$\dot x = g(y)$$
$$\dot y = -f(x) - k(y),$$

where $f, g, k$ are functions that have positive derivatives. Suppose that $(x_0, y_0)$ is a steady state. Then this is the unique steady state of the system, and it is globally asymptotically stable.

Corollary 5.2 Consider any two-dimensional system evolving on $\mathbb{R}_{>0} \times \mathbb{R}$ and having the following form:

$$\dot x = x\,g(y)$$
$$\dot y = -f(x) - k(y)$$

where $k$ has a positive derivative and either (a) both $f$ and $g$ have positive derivatives or (b) both $f$ and $g$ have negative derivatives. Suppose that $(x_0, y_0)$ is a steady state. Then this is the unique steady state of the system, and it is globally asymptotically stable.

Proof. Suppose first that both $f$ and $g$ have positive derivatives. We transform the system into one in $\mathbb{R}^2$, using variables $y$ and $z = \ln x$:

$$\dot z = \tilde g(y)$$
$$\dot y = -\tilde f(z) - k(y)$$

where $\tilde g(y) = g(y)$ and $\tilde f(z) := f(e^z)$. The functions defining this system have positive derivatives, as required in Lemma 5.1. Moreover, the transformed system has the steady state $(\ln x_0, y_0)$, which is therefore globally asymptotically stable (and unique). Transforming back to the original coordinates, we have proved our claim. Now suppose that case (b) holds instead: both $f$ and $g$ have negative derivatives. We pick $z := -\ln x$. The equations transform as above, except that now $\tilde g(y) = -g(y)$ and $\tilde f(z) := f(e^{-z})$. Since these now have positive derivatives, the same argument as in case (a) applies. ■

The nonlinear integral feedback systems discussed earlier are particular cases of the above form. The linear function $k(y) = \gamma y$ is increasing. When $g(y) = \alpha(y - y_0)$, $g$ is increasing, and when $g(y) = \alpha(y_0 - y)$, $g$ is decreasing. The function $f(x) = -\beta u/x$ (where $u$ is any positive constant) is increasing, while $f(x) = -\beta ux$ is decreasing.



Proof of Lemma 5.1 Uniqueness: since $g$ is strictly increasing, $y_0$ is uniquely determined (as the solution of $g(y) = 0$), and it then follows that $x_0$ is also uniquely determined (because $f$ is increasing) as the solution of $f(x) = -k(y_0)$.

At the steady state $(x_0, y_0)$, we may assume, without loss of generality, that $f(x_0) = 0$ and $k(y_0) = 0$ (as well as $g(y_0) = 0$). Indeed, let $c := f(x_0) = -k(y_0)$. Redefining $\tilde f(x) := f(x) - c$ and $\tilde k(y) := k(y) + c$, we have that $\tilde f(x_0) = 0$ and $\tilde k(y_0) = 0$, and the differential equations have not changed, since $-\tilde f(x) - \tilde k(y) = -(f(x) - c) - (k(y) + c) = -f(x) - k(y)$. Since $f$ is strictly increasing, its values are positive when $x > x_0$ and negative otherwise, and similarly for $g, k$ with respect to $y_0$. Let us define

$$V(x,y) := \int_{x_0}^{x} f(r)\,dr + \int_{y_0}^{y} g(r)\,dr.$$

By definition, $V(x_0, y_0) = 0$ and $V(x,y) > 0$ for all $(x,y) \neq (x_0,y_0)$. As $\partial^2 V/\partial x^2 = f'(x) > 0$, $\partial^2 V/\partial y^2 = g'(y) > 0$, and mixed second derivatives are zero, it follows that the Hessian matrix of $V$ is positive definite everywhere. Thus, $V$ is strictly convex, and it follows that $V$ is a proper function: $V(x,y) \to \infty$ as $\|(x,y)\| \to \infty$. In conclusion, $V$ is a Lyapunov function candidate. To conclude global stability based on the LaSalle Invariance Principle [5], we must show that the derivative of $V$ along trajectories:

$$\dot V(x,y) := \frac{\partial V}{\partial x}(x,y)\,g(y) + \frac{\partial V}{\partial y}(x,y)\,[-f(x) - k(y)]$$

has the properties that $\dot V(x,y) \le 0$ for all $(x,y)$ and that $(x(t),y(t)) \equiv (x_0,y_0)$ is the only solution with $\dot V(x(t),y(t)) \equiv 0$. Now,

$$\dot V(x,y) = f(x)g(y) + g(y)[-f(x) - k(y)] = -g(y)k(y) \le 0$$

because $g$ and $k$ have everywhere the same sign (positive if $y > y_0$, negative if $y < y_0$). If a solution satisfies $\dot V(x(t),y(t)) \equiv 0$, then $y(t) \equiv y_0$, so that also $\dot y(t) \equiv 0$, which substituted into the second equation gives $0 = -f(x(t)) - 0$, which implies that $x(t) \equiv x_0$. ■
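
As a numerical illustration of the Lyapunov argument, the sketch below takes the first nonlinear integral feedback system in the log coordinate $z = \ln x$, so that $\dot z = \alpha(y - y_0)$, $\dot y = \beta u e^{-z} - \gamma y$, builds $V$ from the normalized functions $\tilde f(z) = \gamma y_0 - \beta u e^{-z}$ and $g(y) = \alpha(y - y_0)$, and checks that $V$ does not increase along a simulated trajectory. The parameter values, the initial state, and the RK4 integrator are assumptions for illustration.

```python
import math

ALPHA, BETA, GAMMA, Y0, U = 1.0, 1.0, 1.0, 1.0, 1.0   # hypothetical values
Z0 = math.log(BETA * U / (GAMMA * Y0))                 # steady state in z = ln x


def rhs(s):
    """Log-coordinate form: z' = alpha (y - y0), y' = beta u exp(-z) - gamma y."""
    z, y = s
    return (ALPHA * (y - Y0), BETA * U * math.exp(-z) - GAMMA * y)


def V(s):
    """Lyapunov function: integral of the normalized f from Z0 to z plus integral of g from Y0 to y."""
    z, y = s
    int_f = GAMMA * Y0 * (z - Z0) + BETA * U * (math.exp(-z) - math.exp(-Z0))
    int_g = 0.5 * ALPHA * (y - Y0) ** 2
    return int_f + int_g


def rk4_step(s, dt):
    """One classical Runge-Kutta step for the autonomous system s' = rhs(s)."""
    k1 = rhs(s)
    k2 = rhs((s[0] + 0.5 * dt * k1[0], s[1] + 0.5 * dt * k1[1]))
    k3 = rhs((s[0] + 0.5 * dt * k2[0], s[1] + 0.5 * dt * k2[1]))
    k4 = rhs((s[0] + dt * k3[0], s[1] + dt * k3[1]))
    return (s[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            s[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)


state, dt = (1.5, -0.5), 0.01
values = [V(state)]
for _ in range(6000):                 # integrate up to t = 60
    state = rk4_step(state, dt)
    values.append(V(state))

print(values[0], values[-1], state)
```

Along the computed trajectory $V$ decreases from its initial value toward zero and the state approaches $(\ln x_0, y_0)$, consistent with $\dot V = -g(y)\tilde k(y) \le 0$.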

Observe that the only place that $k$ appears in the proof is in the statement that $g$ and $k$ have everywhere the same sign; thus the same proof works if instead of assuming that $k$ has everywhere positive derivatives, we assume that $f(x_0) = 0$ and $(y - y_0)k(y) > 0$ for all $y \neq y_0$. Note that $V$ would be a Hamiltonian for the system if $k \equiv 0$, so the system can be thought of simply as adding damping to a conservative system.






6 Comparing feedforward and feedback structures 



We make several remarks in this section concerning the relations and comparisons between 
adapting feedforward and feedback architectures. 



6.1 Internal model principle 

The "internal model principle" (IMP, for short) in control theory states that one should be able 
to recast any system which adapts to steps as a system which integrates an "adaptation error" 
signal (integral feedback). For example, it should be possible to rewrite the feedforward system in 
Fig. 2(a) in such a manner. In this section, we review one precise statement of the IMP, and apply
it to this example. 

Adaptation is called "disturbance rejection" in control theory [5] (not to be confused with a different topic, "adaptive control"). A key mathematical idea, the internal model principle (IMP), states that, to be able to adapt to all signals in a given class of inputs "$\mathcal{U}$", the system must include an "internal model": a subsystem which is driven by the "error" in adaptation, and whose solutions when the error is zero (that is, when the system has perfectly adapted) are the possible signals in $\mathcal{U}$. Intuitively, an internal representation of the external signal is memorized, and adaptation performance is constantly evaluated; any error in adaptation is used to form a better estimate of this external signal. For example, for adaptation to constant signals, the IMP requires integral feedback, as in Fig. 1: $\mathcal{U}$ = all constant signals, the error is $y - y_0$, and solutions of $\dot x = 0$ are precisely the constant signals. In systems biology, the IMP suggests biochemical structures, thus guiding modeling and experiments as well as interpretation of the role of various regulatory and signal processing motifs. For instance, the relevance of the IMP to E. coli chemotaxis was remarked in [42]: the methylation state can be viewed as a memory (integrator) and the "error" is the average kinase activity relative to its basal value. The IMP encompasses adaptation also with respect to richer classes of signals $\mathcal{U}$, not just constant ones. For example, one might speculate [43] that circadian rhythms might have evolved as an IMP mechanism to allow adaptation to day/night light and temperature cycles: a harmonic oscillator with period $T$ is predicted by the IMP when a system adapts to $\mathcal{U}$ = all signals of the form $A\sin(2\pi t/T + \varphi)$. The IMP was proved as a theorem for linear dynamics by Francis and Wonham in the mid 1970s [44, 45]. It remains an open problem to find ultimate nonlinear generalizations, but there are some partial extensions known. For example, using "zero dynamics" ideas from [46], a theorem was given in [43] that shows, under appropriate technical assumptions, the existence of coordinate changes, generally nonlinear, that exhibit an internal model. Using this theorem, one should expect to find coordinate changes transforming IFFL circuits into integral feedback form, and this is indeed true. We first describe the comparatively trivial case of linear systems and then discuss how to obtain an analogous result for nonlinear IFFL systems.

As a simple first illustration, consider the following feedforward linear system:

$$\dot x = -x + u, \quad \dot y = -x - y + u,$$

which perfectly adapts to $y = y_0 = 0$ but is not in integral feedback form. We may perform a simple change of coordinates, representing the system using the state variables $(\tilde x = x - y, y)$ instead of the original $(x,y)$. In this new set of coordinates, we have:

$$\dot{\tilde x} = y, \quad \dot y = -\tilde x - 2y + u, \qquad (9)$$

which is now an integral feedback system (the variable $\tilde x$ integrates the "error" $y$). We next review the main theorem from [43], and then work out the application to the nonlinear feedforward system in Fig. 2(a).
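
For completeness, the linear change of coordinates $\tilde x = x - y$ can be verified in one line per equation. The following worked derivation uses only the feedforward equations $\dot x = -x + u$, $\dot y = -x - y + u$ stated above:

```latex
\begin{aligned}
\dot{\tilde x} &= \dot x - \dot y = (-x + u) - (-x - y + u) = y, \\
\dot y &= -x - y + u = -(\tilde x + y) - y + u = -\tilde x - 2y + u .
\end{aligned}
```

At any steady state with constant $u$, the first equation forces $y = 0$ regardless of $u$, which is the adaptation property, and the second then gives $\tilde x = u$.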



The general setup in [43] is as follows. The systems studied are scalar-input scalar-output $n$-dimensional systems for which the input appears to first order:

$$\dot z = f(z) + u\,g(z), \quad y = h(z). \qquad (10)$$

The vector fields $f$ and $g$ are smooth, and $h$ is a smooth function. We assume that $z = 0$ is a steady state when $u = 0$, that is, $f(0) = 0$.



We will say that the system (10) adapts to inputs in a class $\mathcal{U}$ if for each $u \in \mathcal{U}$ and each initial state $x^0 \in \mathbb{R}^n$, the solution of (10) with initial condition $x(0) = x^0$ exists for all $t \ge 0$ and is bounded, and the corresponding output $y(t) = h(x(t))$ converges to a fixed value $y_0$ (which does not depend on the particular input $u \in \mathcal{U}$) as $t \to \infty$.

In control theory, it is standard to describe the class of inputs $\mathcal{U}$ with respect to which adaptation holds through the specification of an "exosystem" that produces these inputs. An exosystem is simply any autonomous system $\Gamma$:

$$\dot w = Q(w), \quad u = \theta(w) \qquad (11)$$

with the following property: the input class $\mathcal{U}$ consists exactly of the functions $u(t) = \theta(w(t))$, $t \ge 0$, for each possible initial condition $w(0)$. For example, if we are interested in step responses, we pick $\dot w = 0$, $u = w$. This means that the possible signals are the solutions of $\dot w = 0$, i.e. the constant functions of time; that is, $\mathcal{U}$ is the set of functions $u(t)$ for which $u(t) \equiv \bar u$ for all $t$ for some $\bar u \in U$. On the other hand, if we are interested in sinusoidals with frequency $\omega$ then we use $\dot x_1 = x_2$, $\dot x_2 = -\omega^2 x_1$, $u = x_1$. A technical assumption is that the signals in $\mathcal{U}$ do not grow without bound. Specifically, one assumes that the exosystem is Poisson-stable, meaning that for every state $w^0$, the solution $w(\cdot)$ of $\dot w = Q(w)$, $w(0) = w^0$ is defined for all $t \ge 0$ and it satisfies that $w^0$ is in the omega-limit set of $w$ (recall that this means that there is a sequence of times $t_i \to \infty$ such that the sequence $w(t_i)$ converges to $w^0$ as $i \to \infty$). In other words, the exosystem is almost-periodic in the sense that trajectories keep returning to neighborhoods of the initial state. Both the constant and sinusoidal examples mentioned above are generated by Poisson-stable systems. In contrast, ramps (linearly growing signals) are not generated by Poisson-stable systems, since they require an unstable second-order system $\dot w_2 = 0$, $\dot w_1 = w_2$, $u = w_1$ to generate them. Thus, the phenomenon of adaptation to ramps is not included in the scope of the theorem to be stated. The exosystem is assumed to have states that evolve on some differentiable manifold, $Q$ is a smooth vector field, and $\theta$ is a real-valued smooth function.
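
To make the notion of an exosystem concrete, the sketch below generates inputs $u(t) = \theta(w(t))$ for the two Poisson-stable examples mentioned above, constants and sinusoids, by integrating $\dot w = Q(w)$. The frequency, amplitude, phase, and function names are assumptions chosen for illustration.

```python
import math


def rk4_step(field, w, dt):
    """One classical Runge-Kutta step for the autonomous system w' = field(w)."""
    k1 = field(w)
    k2 = field([wi + 0.5 * dt * ki for wi, ki in zip(w, k1)])
    k3 = field([wi + 0.5 * dt * ki for wi, ki in zip(w, k2)])
    k4 = field([wi + dt * ki for wi, ki in zip(w, k3)])
    return [wi + dt * (a + 2 * b + 2 * c + d) / 6.0
            for wi, a, b, c, d in zip(w, k1, k2, k3, k4)]


def generate_input(Q, theta, w0, t_end, dt=1e-3):
    """Sample u(t) = theta(w(t)) along the solution of w' = Q(w), w(0) = w0."""
    w, samples = list(w0), [theta(w0)]
    for _ in range(int(round(t_end / dt))):
        w = rk4_step(Q, w, dt)
        samples.append(theta(w))
    return samples


OMEGA, AMP, PHASE, T_END = 2.0, 1.5, 0.3, 10.0   # hypothetical values

# Step inputs: w' = 0, u = w. Each initial condition yields one constant signal.
steps = generate_input(lambda w: [0.0], lambda w: w[0], [0.7], T_END)

# Sinusoids of frequency omega: x1' = x2, x2' = -omega^2 x1, u = x1.
sinus = generate_input(lambda w: [w[1], -OMEGA ** 2 * w[0]], lambda w: w[0],
                       [AMP * math.sin(PHASE), AMP * OMEGA * math.cos(PHASE)], T_END)
expected_end = AMP * math.sin(OMEGA * T_END + PHASE)

print(steps[-1], sinus[-1], expected_end)
```

Different initial conditions $w(0)$ select different members of the class: here $w(0) = 0.7$ gives the constant input $0.7$, and $w(0) = (A\sin\varphi, A\omega\cos\varphi)$ gives $A\sin(\omega t + \varphi)$.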



The IMP claims that a copy of this exosystem must be embedded in the system (10). More precisely, one says that the system contains an output-driven internal model of $\mathcal{U}$ if there is a change of coordinates which brings the equations (10) into the following block form:

$$\begin{aligned} \dot z_1 &= f_1(z_1,z_2) + u\,g_1(z_1,z_2) \\ \dot z_2 &= f_2(y,z_2) \\ y &= \kappa(z_1) \end{aligned} \qquad (12)$$

so that the subsystem with state variables $z_2$ is capable of generating all the possible functions in $\mathcal{U}$: for some function $\varphi(z_2)$, and for each possible $u \in \mathcal{U}$, there is some solution of

$$\dot z_2 = f_2(y_0,z_2) \qquad (13)$$

which satisfies $\varphi(z_2(t)) \equiv u(t)$. "Change of coordinates" means that there is some integer $r < n$ and two differentiable manifolds $Z_1$ and $Z_2$ of dimensions $r$ and $n - r$ respectively, as well as a smooth function $\kappa : Z_1 \to \mathbb{R}$ and two vector fields $F$ and $G$ on $Z_1 \times Z_2$ which take the partitioned form

$$F = \begin{pmatrix} f_1(z_1,z_2) \\ f_2(\kappa(z_1),z_2) \end{pmatrix}, \quad G = \begin{pmatrix} g_1(z_1,z_2) \\ 0 \end{pmatrix}$$

and a diffeomorphism $\Phi : \mathbb{R}^n \to Z_1 \times Z_2$, such that

$$\Phi'(x)f(x) = F(\Phi(x)), \quad \Phi'(x)g(x) = G(\Phi(x)), \quad \kappa(\Phi_1(x)) = h(x)$$

for all $x \in \mathbb{R}^n$, where $\Phi_1$ is the $Z_1$-component of $\Phi$ and prime indicates Jacobian. Intuitively, the signal $z_2$ computes an integral of a function of the output $y(t)$, and when $y(t) \equiv y_0$, $z_2$ is (up to the mapping $\varphi$, which may be interpreted as a sort of rescaling) a signal in $\mathcal{U}$. For example, if $\mathcal{U}$ consists of constant functions (adaptation to steps), then for $y \equiv y_0$ one obtains (for different initial conditions) the possible constant signals.

In order to prove a theorem justifying the IMP, several technical conditions are imposed in [43]. The first is a signal detection or "sensitivity" property: (1) for some positive integer $r$, called in control theory a finite uniform relative degree,

$$L_g L_f^k h \equiv 0, \quad k = 0, \ldots, r-2 \qquad\text{and}\qquad L_g L_f^{r-1} h(x) \neq 0 \quad \text{for all } x .$$

As in the section on observability, generally, $L_X H$ denotes the directional or Lie derivative of a function $H$ along the direction of a vector field $X$: $(L_X H)(x) = \nabla H(x) \cdot X(x)$, and one understands $L_Y L_X H$ as the iteration $L_Y(L_X H)$. (In the special case that $L_g h(x) \neq 0$ for all $x$, the relative degree is $r = 1$, since the condition for $k < r - 1$ is vacuous.) Given that the relative degree is $r$, one may consider the following vector fields:

$$\tilde g(x) := \frac{1}{L_g L_f^{r-1} h(x)}\, g(x), \quad \tilde f(x) := f(x) - (L_f^r h(x))\,\tilde g(x), \quad \tau_i := \mathrm{ad}_{\tilde f}^{\,i-1}\,\tilde g, \quad i = 1, \ldots, r,$$

where $\mathrm{ad}_X$ is the operator $\mathrm{ad}_X Y = [X,Y]$ = Lie bracket of the vector fields $X$ and $Y$, and $\mathrm{ad}_X^{i-1}$ is the iteration of this operator $i - 1$ times (when $i = 1$, $\tau_1 = \tilde g$). One says that a vector field $X$ is complete if the solution of the initial value problem $\dot x = X(x)$, $x(0) = x^0$ is defined for all $t$ and for any initial state $x^0$. Two vector fields $X$ and $Y$ are said to commute if $[X,Y] = 0$. The final assumptions, then, are that (2) $\tau_i$ is complete, for $i = 1, \ldots, r$ and (3) the vector fields $\tau_i$ commute with each other. (In the special case $r = 1$, condition (3) is automatic, since every vector field commutes with itself.)
These assumptions are satisfied for linear systems. They are also satisfied, for example, for the feedforward system in Fig. 2(a):

$$\dot x = \alpha u - \delta x, \quad \dot y = \beta\frac{u}{x} - \gamma y \qquad (14)$$

with $h(x,y) = y$. In vector form, this is $\dot z = f(z) + u\,g(z)$, where the vector fields are:

$$f(x,y) = \begin{pmatrix} -\delta x \\ -\gamma y \end{pmatrix} \quad\text{and}\quad g(x,y) = \begin{pmatrix} \alpha \\ \beta/x \end{pmatrix}. \qquad (15)$$

Since $L_g h = (0,1) \cdot (\alpha, \beta/x)^T = \beta/x$ is everywhere nonzero, we have that $r = 1$. Thus we only need to check that

$$\tau_1 = \tilde g = \frac{1}{L_g h(x)}\, g = \begin{pmatrix} \alpha x/\beta \\ 1 \end{pmatrix}$$

is complete, which is true because $\tilde g$ is an affine vector field.



The main theorem in [43] says: Suppose that assumptions (1)-(3) hold for the system (10). If (10) adapts to inputs in a class $\mathcal{U}$ generated by a Poisson-stable exosystem, then it contains an output-driven internal model of $\mathcal{U}$.

The proof of the theorem consists of showing that there is, under the stated conditions, a change 
of variables as claimed. The map producing the change of variables is obtained by solving a first- 
order partial differential equation. 



6.2 Illustration of IMP for the feedforward system in Fig. 2(a)



We consider the system (14), or (15) in vector form. We already checked properties (1)-(3), and the system adapts to steps (constant inputs), so the theorem says that it should be possible to recast it in integral feedback form. The proof in [43] asserts the existence of a mapping $\varphi(x,y)$ whose Lie derivative along $g$ vanishes, that is, which solves the following first-order linear PDE:

$$L_g\varphi = \nabla\varphi \cdot g = \alpha\varphi_x(x,y) + \frac{\beta}{x}\varphi_y(x,y) = 0.$$

Generally, such an equation may be solved using the method of characteristics. However, in our example the solution is immediate: $\varphi(x,y) = \alpha y - \beta\log x$. The map

$$(x,y) \mapsto (z_1,z_2) = (y, \varphi(x,y)) = (y, \alpha y - \beta\log x)$$

is a diffeomorphism whose inverse is $y = z_1$ and $x = e^{(\alpha z_1 - z_2)/\beta}$. We obtain the following equations in the new coordinates $(z_1,z_2)$:

$$\dot z_1 = \beta u\,e^{(z_2 - \alpha z_1)/\beta} - \gamma z_1$$
$$\dot z_2 = \beta\delta - \alpha\gamma z_1$$

with output $y = z_1$. This has the desired internal model form $\dot z_1 = f_1(z_1,z_2) + u\,g_1(z_1,z_2)$, $\dot z_2 = f_2(y,z_2)$, $y = \kappa(z_1)$, if we define: $f_1(z_1,z_2) = -\gamma z_1$, $g_1(z_1,z_2) = \beta e^{(z_2 - \alpha z_1)/\beta}$, $f_2(y,z_2) = f_2(y) = \beta\delta - \alpha\gamma y$, and $\kappa$ = identity. Thus $z_2$ is the variable that integrates the error: when $y = y_0 = \beta\delta/(\alpha\gamma)$, the equation for $z_2$ becomes $\dot z_2 = 0$, whose solutions are all the possible constant signals. We can also write this system in terms of the coordinates $\tilde x = e^{z_2/\beta}$, $y = z_1$ as follows:

$$\dot{\tilde x} = \tilde x\left(\delta - \frac{\alpha\gamma}{\beta}y\right), \quad \dot y = \beta u\tilde x\,e^{-\frac{\alpha}{\beta}y} - \gamma y \qquad (16)$$
which has the generic form x = xF(y), y = G{ux, y) of nonlinear integral feedback systems consid- 
ered in Lemma [XT] 
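
The equivalence between the feedforward system (14) and its feedback recasting (16) can be checked by simulating both and comparing outputs. In the sketch below the parameter values, the step input, and the zero-order-hold RK4 integrator are assumptions made for illustration; the coordinate map is $\tilde x = e^{z_2/\beta} = e^{\alpha y/\beta}/x$.

```python
import math

ALPHA, BETA, GAMMA, DELTA = 1.2, 0.8, 1.5, 0.6   # hypothetical parameter values


def feedforward(s, u):
    """System (14): state (x, y)."""
    x, y = s
    return (ALPHA * u - DELTA * x, BETA * u / x - GAMMA * y)


def feedback(s, u):
    """System (16): state (x_tilde, y)."""
    xt, y = s
    return (xt * (DELTA - ALPHA * GAMMA * y / BETA),
            BETA * u * xt * math.exp(-ALPHA * y / BETA) - GAMMA * y)


def to_feedback_coords(s):
    """(x, y) -> (x_tilde, y) with x_tilde = exp(z2 / beta), z2 = alpha y - beta log x."""
    x, y = s
    return (math.exp(ALPHA * y / BETA) / x, y)


def simulate_output(field, s, inputs, dt):
    """RK4 with the input held constant over each step; returns samples of y."""
    ys = [s[1]]
    for u in inputs:
        k1 = field(s, u)
        k2 = field((s[0] + 0.5 * dt * k1[0], s[1] + 0.5 * dt * k1[1]), u)
        k3 = field((s[0] + 0.5 * dt * k2[0], s[1] + 0.5 * dt * k2[1]), u)
        k4 = field((s[0] + dt * k3[0], s[1] + dt * k3[1]), u)
        s = (s[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             s[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
        ys.append(s[1])
    return ys


dt, n_steps = 1e-3, 20000
inputs = [1.0 if i * dt < 2.0 else 3.0 for i in range(n_steps)]   # step from 1 to 3 at t = 2
y_adapted = BETA * DELTA / (ALPHA * GAMMA)                         # adapted output value
start = (ALPHA * 1.0 / DELTA, y_adapted)                           # steady state for u = 1

y_ff = simulate_output(feedforward, start, inputs, dt)
y_fb = simulate_output(feedback, to_feedback_coords(start), inputs, dt)
max_gap = max(abs(a - b) for a, b in zip(y_ff, y_fb))

print(max_gap, y_ff[-1], y_adapted)
```

Both descriptions produce the same output trajectory, which responds transiently to the step and then returns to $\beta\delta/(\alpha\gamma)$.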

6.3 What are the relative advantages of different architectures?



Both integral feedback and IFFL circuits allow adaptation, as well as, for appropriate models, 
symmetry invariance to scalings. Thus, it is natural to ask what are the relative merits of each of 
these architectures: what fitness-conferring signal processing and control properties are special for 
them? We view this as a question for further research, and limit ourselves here to a few remarks. 

In a certain sense, the question is meaningless, since feedforward networks can often be simulated by feedback ones, as just shown. Nonetheless, the variable $\tilde x = x - y$ may well be merely a
mathematical construct with no biological meaning. In addition, any (linearized) system obtained 
by such a transformation from an IFFL is special: it can have only real eigenvalues, while the more 
general integral feedback form may have damped oscillatory behavior (depending on parameters). 

This means that the feedback systems have a wider range of possible dynamical behaviors, and 
thus might be selected when it is desirable to meet specific performance objectives. In addition, 
feedback confers a certain robustness to uncertainty. We illustrate this point by comparing the 
feedforward system in Fig. [2]; a): 

u 

X = au - ox , y = j3 yy 

X 

with its recasting in feedback form: 

I ^ ay \ ^ _s.y 

X = x\6 - ~^y\ ' y - puxe - yy . 

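Both systems adapt ($y(t) \to \frac{\beta\delta}{\alpha\gamma}$). However, in the feedback form, any perturbation of the
right-hand side, $\dot y = \beta u x e^{-\frac{\alpha}{\beta} y} - \gamma y + \Delta(x, y)$, does not alter the property that the steady state must
have $y = \frac{\beta\delta}{\alpha\gamma}$ (this value is forced by $\dot x = 0$ with $x \neq 0$), whereas no analogous simple statement can be
made for the feedforward form.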
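
The robustness remark can be checked numerically. The following is a minimal sketch, not taken from the paper: it integrates both forms with forward Euler under a constant perturbation $\Delta(x, y) = \varepsilon$ added to the $\dot y$ equation. Unit parameters ($\alpha = \beta = \gamma = \delta = 1$), the constant input $u = 2$, the initial state, and the function name `steady_output` are all illustrative assumptions.

```python
import math


def steady_output(form, eps, u=2.0, alpha=1.0, beta=1.0, gamma=1.0,
                  delta=1.0, t_end=60.0, dt=1e-3):
    """Integrate one of the two forms with forward Euler and return y(t_end).

    eps is a constant perturbation Delta(x, y) = eps added to dy/dt
    (an illustrative choice of perturbation, not one from the paper).
    """
    x, y = 1.0, 0.5  # arbitrary positive initial state (assumption)
    for _ in range(int(round(t_end / dt))):
        if form == "feedforward":
            dx = alpha * u - delta * x
            dy = beta * u / x - gamma * y + eps
        elif form == "feedback":
            dx = x * (delta - alpha * gamma / beta * y)
            dy = beta * u * x * math.exp(-alpha * y / beta) - gamma * y + eps
        else:
            raise ValueError("form must be 'feedforward' or 'feedback'")
        x, y = x + dt * dx, y + dt * dy
    return y


if __name__ == "__main__":
    for form in ("feedforward", "feedback"):
        for eps in (0.0, 0.3):
            print(form, eps, round(steady_output(form, eps), 6))
```

With these values, the feedback form should settle at $y = \beta\delta/(\alpha\gamma) = 1$ for both values of $\varepsilon$, whereas the feedforward form has steady state $y = (\beta\delta/\alpha + \varepsilon)/\gamma$, that is, $1.3$ when $\varepsilon = 0.3$.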

Conversely, one may also speculate that biological or evolutionary constraints could limit the value
of feedback structures. As an example, the stability of feedback (but not feedforward) systems is
fragile to delays. Delays could arise from slower time scales for processes such as transcription and
translation, compared to protein modifications. As an illustration, Fig. 6 shows oscillations arising
from a delay from $y$ to $x$ in the linear integral feedback system (9), and Fig. 7 shows oscillations
arising from a delay from $y$ to $x$ in the nonlinear integral feedback system (16).
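
The effect of the delay can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions rather than the computation used for the figures: it integrates the delayed linear system of Fig. 6, $\dot x(t) = y(t - h)$, $\dot y(t) = -x(t) - 2y(t) + u(t)$, by forward Euler, with zero history for $t \le 0$ and a constant input $u = 1$ (the input value is an assumption, since it is not recoverable from the caption). The helper names are hypothetical.

```python
def simulate_linear_feedback(h, u=1.0, t_end=150.0, dt=0.01):
    """Forward-Euler integration of
        dx/dt = y(t - h),  dy/dt = -x(t) - 2*y(t) + u,
    with zero history for t <= 0. Returns the list of y samples."""
    n_steps = int(round(t_end / dt))
    lag = int(round(h / dt))  # delay expressed in time steps
    x = [0.0] * (n_steps + 1)
    y = [0.0] * (n_steps + 1)
    for k in range(n_steps):
        y_delayed = y[k - lag] if k >= lag else 0.0
        x[k + 1] = x[k] + dt * y_delayed
        y[k + 1] = y[k] + dt * (-x[k] - 2.0 * y[k] + u)
    return y


def late_amplitude(y):
    """Largest |y| over the last third of the record."""
    tail = y[2 * len(y) // 3:]
    return max(abs(v) for v in tail)


def late_sign_changes(y):
    """Number of sign changes of y over the last third of the record."""
    tail = y[2 * len(y) // 3:]
    return sum(1 for a, b in zip(tail, tail[1:]) if a * b < 0.0)


if __name__ == "__main__":
    for h in (0.0, 5.0):
        y = simulate_linear_feedback(h)
        print("h =", h, "late amplitude =", late_amplitude(y),
              "late sign changes =", late_sign_changes(y))
```

Without delay the output returns to zero (adaptation), since the characteristic polynomial $s^2 + 2s + 1$ is stable. With $h = 5$ the characteristic equation $s^2 + 2s + e^{-sh} = 0$ has roots in the right half-plane, so the sketch should show oscillations of $y$ that do not die out.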



7 Invariant steering 

As remarked in fSl, motile systems that measure a field in order to determine their velocity of 
movement have the property that their entire search patterns, as a function of time, are invariant 
to scale, if their sensory systems have the FCD property. We now discuss a precise formulation of 
this fact for arbitrary systems and symmetries. 

We think of a system that, through its output $y(t)$, drives a steering mechanism ("motor
complex"), resulting in a new position $r(t)$. We model this by a system with inputs, which is a way of
saying that the position is computed by a dynamical system that keeps track of the past history of
$y$:

$$\dot q = Q(q, y), \qquad r = R(q)$$

Figure 6: Oscillations in $\dot x(t) = y(t - h)$, $\dot y(t) = -x(t) - 2y(t) + u(t)$. Using $u =$ and $h = 5$.

Figure 7: Oscillations in $\dot x(t) = x(t)(1 - y(t - h))$, $\dot y(t) = x(t)u(t)e^{-y(t)} - y(t)$. Using $u =$ and $h = 5$.



where $q(t)$ is the internal state of the steering mechanism. At position $r(t)$ in space, an "intensity"
(e.g., light or nutrient concentration) is queried, and the result is a sensed input $I(t, r(t))$. The
intensity field $I(t, r)$ could well be time- as well as space-dependent. Finally, the loop is closed
by the system measuring $I(t, r(t))$, except that we are interested in understanding how the system
behaves if it measures $\pi I(t, r(t))$ instead of $I(t, r(t))$, where $\pi \in \mathcal{P}$ is a symmetry. See Fig. 8 for
an illustration. We want to study the invariance of behavior of this system, and in particular of
its position $r(t)$ as a function of time, under the assumption that the system had pre-adapted to a
constant environment, $I(t, r) = I_0$ for $t < 0$, before being placed in the current environment.

Formally, we start with a system that adapts and is invariant with respect to a set of symmetries
$\mathcal{P}$, and consider the following extended system:

$$\dot z = F(z, u), \qquad u = I(t, r),$$
$$\dot q = Q(q, y), \qquad y = h(z), \qquad r = R(q).$$

We let $(z, q)$ be the solution with initial conditions $z(0) = \sigma(I_0)$ and $q(0) = q_0$, where $Q(q_0, y_0) = 0$,
that is, $q_0$ is a steady state that corresponds to the adaptation value $y_0$ of the original system.

Figure 8: Closed-loop diagram for search under symmetry uncertainty for inputs. [Diagram: the position $r(t)$ is used to query the unscaled intensity $I(t, r)$ at position $r$; the symmetry $\pi$ acts on it to give the sensed input $u(t)$; the system produces the steering command $y(t)$, which drives the steering mechanism that produces $r(t)$.]



Also, for any given $\pi \in \mathcal{P}$, we consider the solution $(\tilde z, \tilde q)$ of the system with initial conditions
$\tilde z(0) = \sigma(\pi I_0)$ and $\tilde q(0) = q_0$ and intensity field $\pi I(t, r)$:

$$\dot{\tilde z} = F(\tilde z, \tilde u), \qquad \tilde u = \pi I(t, \tilde r),$$
$$\dot{\tilde q} = Q(\tilde q, \tilde y), \qquad \tilde y = h(\tilde z), \qquad \tilde r = R(\tilde q).$$

We claim that $\tilde y(t) = y(t)$, $\tilde u(t) = \pi u(t)$, $\tilde r(t) = r(t)$, and $\tilde q(t) = q(t)$ for all $t \ge 0$.
To prove the claim, we consider the solution of the system

$$\dot x = F(x, \pi u(t)),$$
$$\dot s = Q(s, h(x))$$

with initial conditions $x(0) = \sigma(\pi I_0)$ and $s(0) = q_0$. By definition of $\mathcal{P}$-invariance, we know that
$h(x(t)) = h(z(t))$ for all $t \ge 0$. Thus, since the initial conditions on $q$ and $s$ are the same, it follows
that also $s(t) = q(t)$ for all $t \ge 0$. Therefore, $u(t) = I(t, r(t)) = I(t, R(q(t))) = I(t, R(s(t)))$ for all
$t \ge 0$. It follows that $\dot x = F(x, \pi I(t, r))$ with $r = R(s)$. We conclude that $(x, s)$ and $(\tilde z, \tilde q)$ solve the same
initial-value problem, and thus $x(t) = \tilde z(t)$ and $s(t) = \tilde q(t)$ for $t \ge 0$, from which
$\tilde y(t) = h(\tilde z(t)) = h(x(t)) = h(z(t)) = y(t)$, and the claim is proved.
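
The claim can be illustrated numerically. The sketch below is only an example under stated assumptions, not a construction from the paper: the sensor is the scale-invariant feedforward system $\dot x = u - x$, $\dot y = u/x - y$ (unit parameters, adapted output $y_0 = 1$), the steering mechanism is $\dot q = y - y_0$ with $r = q$, the symmetry is multiplication by a constant $p$, and the intensity field `field` is a hypothetical positive function. Integration is by forward Euler.

```python
import math


def field(t, r):
    """Hypothetical positive intensity field I(t, r); not from the paper."""
    return math.exp(-(r - 2.0) ** 2) + 0.2 + 0.1 * math.sin(t)


def positions(p, i0=1.0, x0=None, t_end=20.0, dt=1e-3):
    """Forward-Euler path r(t) when the sensor reads p * I(t, r).

    Sensor (assumed): dx/dt = u - x, dy/dt = u/x - y, adapted output y0 = 1.
    Steering (assumed): dq/dt = y - y0, r = q, so Q(q0, y0) = 0 for any q0.
    By default x0 = p * i0, i.e. the sensor is pre-adapted to the input p * i0.
    """
    y0 = 1.0
    x = p * i0 if x0 is None else x0
    y, q = y0, 0.0
    path = [q]
    for k in range(int(round(t_end / dt))):
        u = p * field(k * dt, q)  # sensed input at the current position
        dx, dy, dq = u - x, u / x - y, y - y0
        x, y, q = x + dt * dx, y + dt * dy, q + dt * dq
        path.append(q)
    return path


def max_gap(a, b):
    """Largest pointwise distance between two paths of equal length."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


if __name__ == "__main__":
    base = positions(1.0)
    scaled = positions(10.0)             # scaled field, pre-adapted to 10 * I0
    unadapted = positions(10.0, x0=1.0)  # scaled field, initial state not modified
    print("scaled vs base:", max_gap(base, scaled))
    print("unadapted vs base:", max_gap(base, unadapted))
```

The first gap should vanish up to rounding error, in agreement with the claim, while the second run shows that the modified initial condition $\sigma(\pi I_0)$ matters: if the sensor state is not pre-adapted to the scaled environment, the paths differ.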



Remark 7.1 A converse of this result holds as well. Suppose that, for every possible field $I$,
the above system results in the same $r(t)$ when started from $z(0) = \sigma(I_0)$ as when starting from
$z(0) = \sigma(\pi I_0)$ but using input $\pi I(t, r)$. This property then holds, in particular, when the field $I(t, r)$
is independent of position $r$, that is, $I$ is merely an arbitrary open-loop input. We would like to
conclude that $y(t) = \tilde y(t)$ from $r(t) = \tilde r(t)$, which means, since the input is arbitrary, that the original
system must be invariant under $\pi$. This conclusion will hold provided that the steering system has
the following property: if we solve $\dot q = Q(q, y)$ with initial condition $q(0) = q_0$ and two inputs $y_1$
and $y_2$, and the resulting solutions satisfy $R(q_1(t)) = R(q_2(t))$ for $t \ge 0$, then $y_1(t) = y_2(t)$ for $t \ge 0$.
In control-theoretic terms, this input reconstruction property is stated as the requirement that the
system $\dot q = Q(q, y)$ with output $R(q)$ be input-observable, which is a property closely related to
right-invertibility, inverse dynamics, and "output to input observability" [46, 47]. Observability is
almost enough to guarantee input-observability: if the system is observable, then $q_1(t) = q_2(t)$ for
$t \ge 0$, so it is only needed that $Q(q(t), y_1(t)) = Q(q(t), y_2(t))$ imply $y_1(t) = y_2(t)$, which is a weak
nondegeneracy property. □



Remark 7.2 In many applications, the system output $y(t)$ drives a stochastic steering mechanism:
the system producing the location $r(t)$ is subject to randomness. For example, in bacterial chemotaxis,
$y(t)$ may represent a signal, such as the level of phosphorylated protein CheY, which serves
to bias the random switches between tumbling and swimming behavior. One way to represent
this probabilistic behavior is to model the dynamical system that computes the position from the
history of $y(t)$ as:

$$\dot q = Q(q, y, X), \qquad r = R(q)$$

where $X$ is a random process: $X = \{X_t(\omega),\ t \ge 0\}$ is defined on a probability space $(\Omega, \mathcal{F}, P)$, where
$\Omega$ is a sample space, $\mathcal{F}$ is a $\sigma$-algebra of events, and $P$ is a probability measure; $P(X = v(t))$ is
the probability of a given outcome (sample path) of this process. Under such a formalization,
and assuming appropriate technical conditions, all the variables $(z(t), u(t), r(t), q(t), y(t))$ are themselves
stochastic processes defined on the same probability space $\Omega$. (Technical conditions need
to be imposed to ensure the existence of solutions of the differential equations. We proceed intuitively,
assuming that $X$ has piecewise continuous sample paths, thus avoiding complications of
Itô calculus.) Now, given any fixed $\omega \in \Omega$, we may view $\dot q = Q(q, y, X)$ as a time-varying system
$\dot q(t) = Q(q(t), y(t), t)$, where we have substituted $X = X_t(\omega)$ along this sample path. The previous
proof extends, with no changes, to such a time-varying system. It follows that identical $r(t)$ (as
well as $y(t)$ and $q(t)$) are obtained, whether using the field $I$ or using the changed field $\pi I$, so long
as the initial condition $z(0)$ had also been modified by $\pi$. This holds for each $\omega \in \Omega$. Thus, as a
random variable, $r(t)$ is invariant under $\mathcal{P}$. In particular, all statistics of $r$ remain invariant. □
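
The sample-path argument of Remark 7.2 can be illustrated with a toy run-and-tumble walker. The sketch below is a discrete-time caricature under stated assumptions, not a model from the paper: the sensor is $\dot x = u - x$, $\dot y = u/x - y$ with unit parameters, the heading of a one-dimensional walker flips with probability $y\,\Delta t$ per step, the field `field` is hypothetical, and a fixed seed plays the role of a fixed sample path $\omega$. The scaling factor is taken to be a power of two so that scaling is exact in floating-point arithmetic and rounding cannot flip a tumble decision.

```python
import math
import random


def field(t, r):
    """Hypothetical positive intensity field I(t, r); not from the paper."""
    return math.exp(-(r - 2.0) ** 2) + 0.2 + 0.1 * math.sin(t)


def run_and_tumble(p, seed, i0=1.0, speed=1.0, t_end=20.0, dt=1e-3):
    """One sample path of a 1-D run-and-tumble walker reading p * I(t, r).

    The seed fixes the sample path omega of the random process X.
    Sensor (assumed): dx/dt = u - x, dy/dt = u/x - y, pre-adapted to p * i0.
    Steering (assumed): the heading flips with probability y * dt per step.
    """
    rng = random.Random(seed)
    x, y = p * i0, 1.0
    pos, heading = 0.0, 1.0
    path = [pos]
    for k in range(int(round(t_end / dt))):
        u = p * field(k * dt, pos)   # sensed (scaled) input at the current position
        if rng.random() < y * dt:    # draw from the sample path vs. tumble probability
            heading = -heading
        pos += dt * speed * heading
        x, y = x + dt * (u - x), y + dt * (u / x - y)
        path.append(pos)
    return path


def max_gap(a, b):
    """Largest pointwise distance between two paths of equal length."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))


if __name__ == "__main__":
    same_omega = max_gap(run_and_tumble(1.0, seed=7), run_and_tumble(8.0, seed=7))
    print("same sample path, scaled field:", same_omega)
```

For a fixed seed the scaled and unscaled runs should trace the same path, which is the per-sample-path statement of the remark. Invariance of the statistics of $r$ then follows by repeating the comparison over many seeds.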